EU General Data Protection Regulation (GDPR) German: EU Datenschutz-Grundverordnung (DSGVO)
Dr_Sommer
Hi, as expected, the panic begins...
Intro:
With the new EU General Data Protection Regulation, there is a lot to do to keep users' data safe... especially with an eye on the possible penalties...
So I decided to start this discussion for all possible questions and answers on this topic.
So my first question:
After turning off IP logging, as RJ (not) recommended, I proudly announced in my forum that I'm killing it in terms of GDPR...
BUT... a friend of mine then checked my website and found this:
So can somebody tell me what this thing does, and is it storing user data?
I just turned it off with:
$Configuration['Garden']['Analytics']['Enabled'] = false;
Should that be enough?
Thanks for any info on that...
First, you are allowed to log IP addresses under the GDPR. That modification does not change your compliance with it. (I think not logging IPs is a great move if you can do it, I just don't appreciate confusing GDPR with general privacy improvements.)
Second, the analytics tick is what increments your page view data. If you removed it entirely, your page views would stop changing.
Third, disabling that config setting will kill your Dashboard statistics. We do centrally store some basic site data; we do not store personally identifiable information. It's the bare minimum needed to generate the graphs and know what software versions are being used to run Vanilla (which informs decisions we make about basic requirements).
If you want actually useful information about how the GDPR affects forums, I suggest this resource we put together: https://blog.vanillaforums.com/community-answers-to-common-questions-about-gdpr-community-forums
There is a lot of misinformation about what GDPR means for websites. My tl;dr is this: Vanilla has always been super responsible and minimalist in its data collection, so the impact on folks running Vanilla sites is minimal (unless they've added a bunch of third-party JavaScript widgets that collect more data).
@Linc thanks for putting that online. It would be extremely helpful if your company could share its experiences with the community in case you get requests based on the GDPR: how you reacted and whether that was sufficient.
I just wanted to add a link to give a solid foundation to some of your assumptions: https://ec.europa.eu/info/law/law-topic/data-protection/reform/what-personal-data_en
For user-generated content ("UGC") the blog post wasn't really straightforward. The link above makes it perfectly clear:
Therefore deleting a user and keeping the content is enough (bearing in mind that the UGC may include information which puts it in an identifiable relation to the author).
I think some practical advice might be helpful. One point is how much time there is to react to a request. See Art. 12 (3) for that:
Although action should be taken without delay, a period of one month is granted without any problems. Small forums in particular should be able to explain easily that they need more time to react appropriately, which would give a reaction period of three months.
The first link gives a list of examples of what personal data is. Based on that list and on what I know about Vanilla's database structure, I would say that for a default Vanilla installation these tables need to be looked at:
The easy ones
Possibly containing some personal data like names, IP- or email-addresses
The tedious work
User-generated content can include personal data which has been inserted by the user (or, in the worst case, by some other user).
Four examples that show why it is quite hard to honor a user's wish to be "forgotten":
1. A user has written his phone number in a comment. You would have to either delete all of his content or look through all of it to search for things like that.
2. A user has written his phone number in a comment and later edited that comment to delete the number. It will still reside in GDN_Log!
3. User "John Doe" has written his phone number in a comment and another user has quoted that comment. Even if you delete all content of John Doe, his name and number will still be in your forum.
4. A comment by user "John Doe" has been flagged by another user who added the remark "John Doe is never learning that he is not allowed to post things like that!", which creates a connection between the user ID in GDN_Flag and the user's identity.
And that is not a Vanilla-specific problem; it is a "user is allowed to enter free text" problem. But I have faith that the courts will take such problems into consideration when a forum admin tries to "forget" a user and, for example, not all quoted comments are wiped or some server log still holds an IP address of that user.
Finding user-related content could be easier if there were some Facebook-like feature: while text is being written, nearly every word is matched against the user names (that must cause an awful lot of traffic!), and if a user is found, a connection is made. It would then be relatively easy to find every post that mentions another user, even if the mention wasn't intentional and only happened "on the fly".
But I really cannot imagine a way to do that without causing ridiculously high traffic. That might be okay for desktop users, but on mobile it would be a no-go. There might be some magic that makes it possible with reasonable traffic usage, too.
At least writing mentions to the database could be a start, so that those posts can be found quite easily. I once started a plugin for that but never finished it. Making it a built-in feature would make it easier to find "user-related" content, and it would look nice in the user's profile below "Discussions" and "Comments".
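To illustrate the "write mentions down" idea, here is a rough sketch of only the extraction step. It assumes mentions are written as @name or @"name with spaces"; the function name `extractMentions` is made up for this example and is not part of Vanilla, and storing the result in a table is left out.

```php
<?php
// Hypothetical helper (not a Vanilla API): collect the user names that are
// explicitly mentioned in a post body as @name or @"name with spaces".
function extractMentions(string $body): array {
    // The lookbehind skips "@" inside e-mail addresses such as a@b.com.
    preg_match_all('/(?<![\w@])@(?:"([^"]+)"|(\w+))/u', $body, $matches, PREG_SET_ORDER);
    $names = [];
    foreach ($matches as $match) {
        // Group 1 holds a quoted name, group 2 an unquoted one.
        $names[] = $match[1] !== '' ? $match[1] : $match[2];
    }
    // Each name is reported once, in order of first appearance.
    return array_values(array_unique($names));
}
```

Running this when a comment is saved and storing the names next to the comment ID would give a lookup list of posts per mentioned user. It only catches explicit mentions, not names typed without the @.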
@R_J My position is that it is the user's responsibility to identify comments that they feel compromise their identity, and then moderators will take appropriate action. I'm very much against 1) putting an onus on moderators to conduct manual audits, 2) automatically editing other users' posts, even if they quote or mention them, or 3) blanket deletion of all content by a user.
What one person considers personal info, another may not. I think it's appropriate for the user requesting removal to make that judgement and ask the moderators directly. (That request process I think would be great for automation.)
The only case I would edit another user's posts is if they'd quoted particularly identifiable information and it was brought to my attention. Writing a tool that auto-edits others' posts en masse is super dangerous; we usually won't even do that for migrations.
I fully agree on that!
The judges will tell us that. But here is the reason why I doubt that:
That obviously includes UGC.
Let's assume I want to be forgotten here and you leave my comments online. In more than one of those comments I have a link to one of my GitHub repos with my name. That comment is by definition personal data and as such shall not exist any more when I ask for erasure of my data.
A user doesn't have to specify which personal data has to be erased. He could simply demand "erase all data", and in the example above my comment with the link would be a problem.
All the information-related parts of the GDPR should be no problem, but what will really cause a huge effort is a person who wants to be forgotten. I don't see a better alternative than dumping the database and doing some grep magic on it.
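As a rough sketch of that "grep magic", the following goes through a plain-text database dump line by line and reports every line that contains one of the search terms (a user name, an e-mail address, a phone number). The function name and the idea of passing in a dump file path are assumptions for this example, nothing Vanilla ships.

```php
<?php
// Hypothetical helper: return the lines of a text dump that contain any of
// the given terms (case-insensitive), keyed by their line number.
function findLinesMentioning(string $dumpPath, array $terms): array {
    $hits = [];
    $handle = fopen($dumpPath, 'r');
    if ($handle === false) {
        return $hits; // file could not be opened, nothing to report
    }
    $lineNumber = 0;
    while (($line = fgets($handle)) !== false) {
        $lineNumber++;
        foreach ($terms as $term) {
            if (stripos($line, $term) !== false) {
                $hits[$lineNumber] = rtrim($line);
                break; // one matching term is enough for this line
            }
        }
    }
    fclose($handle);
    return $hits;
}
```

This only finds literal matches, so it helps with names, addresses and numbers you already know, but not with personal data that is merely described.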
This is actually a great example, because I contend that LINK is not personal data. The personal data is that you've added your REAL NAME to your GitHub account, which is their GDPR concern, not the forum's.
My position is that removing personal data on a forum means removing their IP & email addresses - the definitive personally identifiable information. All other modifications are up to the user to specify what they think is identifiable. No tool in the present or foreseeable future can intelligently parse thousands of comments and definitively determine if something within them is personally identifiable. What if you happen to describe the location of your house using entirely landmarks in the course of an essay? You think even a human is going to figure that out if you ask them to skim 3000 comments? It's absurd on the face of it.
@Linc: It's just a small forum, so if it's no problem for the forum software to switch off IP logging and the analytics tick, it's no problem for me...
Yes indeed, there is a lot of misinformation about GDPR... so I decided not to drive myself crazier than I already am...
I just wanted a more or less clean start... and as R_J pointed out, if somebody wants something from us, we have about one to three months to react... that should be enough...
Thanks for this discussion, it helps...
I may have overlooked this part of the discussion. But from what I understood regarding the GDPR, removing the last two digits of IP addresses seems to be a legitimate way of processing IPs under the GDPR rules. My hosting company did this, for example. Or am I wrong?
Vanilla has about 4 fields per user in the GDN_User table that deal with IPs. Would it make sense to have a plugin that strips the last two digits or replaces them with "XX"? What could be affected by changes like this? Core functionality? Plugins like "StopForum SPAM"?
A good read (in German) with links to some examples you might need for your page can be found here: https://www.heise-regioconcept.de/online-marketing/dsgvo-fuer-website-betreiber
The main action item for you is to inform your users. And you have to do that, no matter what other actions you take.
If you have stored your data securely before, you do not need to change anything. You must inform your users how you ensure the security of the data.
It doesn't make sense to snip some numbers from the IP address. Your hosting company might do this to reduce the personal data that they store, but they have a lot more information available which is far more sensitive than your IP address, e.g. your bank account.
And the same goes for your forum. If you allow signing in with Facebook, you store your users' Facebook identities in your database and make them visible to others, too. For me as a user that would be far more interesting than the fact that the forum owner knows my IP address.
@RJ:
But I think if you anonymize the IPs, the GDPR isn't applicable any more...
Source
Where's the sense in masking IP addresses when you still have to store users' email addresses?
If you managed to anonymize each and every piece of user data, you would still have to inform the users about that. That's GDPR.
But I think the rest of the DSGVO wouldn't apply to you (and all your users would be called "Anonymous")
@R_J: I'm with you on this. I thought that, for example, the use of Vanilla statistics could then be left out of the "inform your users" part, as the data would no longer be traceable to a person.
Also, one can say that it is worthwhile to offer anonymization to users where possible, and it's a good move if you can tell your users "we anonymize your data where possible", e.g. IPs.
I still think that masking IP addresses should not be done on the script side but on the server side.
I've done a little research and it seems you would need additional modules for Apache to do so. I think that might be a reason why this is not an option for people using shared hosting services.
If for some reason you want to alter the IP address, you can do so by adding something to /conf/bootstrap.before.php, which is better than my previous "advice" to alter class.request.php.
Quite often I use different browsers or private mode to open my test forum at the same time as admin, as a normal user and as a guest. There is no problem with that although all three browser sessions have the same IP, which makes sense since Vanilla identifies users by the browser's cookie.
You can exchange the user's IP for some fake IP like that.
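A minimal sketch of such an override, assuming the file is loaded before Vanilla reads the request data; the address is nothing more than a placeholder:

```php
<?php
// /conf/bootstrap.before.php
// Overwrite the client address before the application gets to read it.
// '8.8.8.8' is only a placeholder; any fixed address would do.
$_SERVER['REMOTE_ADDR'] = '8.8.8.8';
```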
"8.8.8.8" is the IP of a public DNS server iirc, but who cares since your users will not see this address.
I do not see any advantage in storing crippled IPs over the above approach, since wrong information is wrong information, but if you like, you can do it like that:
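A sketch under the assumption that "crippled" means zeroing the host part: the last octet for IPv4, and everything after the first 48 bits for IPv6. The function name `maskIpAddress` is made up for this example and is not part of Vanilla.

```php
<?php
// /conf/bootstrap.before.php
// Hypothetical helper: zero the host part of an IP address.
function maskIpAddress(string $ip): string {
    $packed = @inet_pton($ip);
    if ($packed === false) {
        return '0.0.0.0'; // not a valid address, keep nothing of it
    }
    // IPv4 is 4 bytes: keep the first 3. IPv6 is 16 bytes: keep the first 6.
    $keep = strlen($packed) === 4 ? 3 : 6;
    $masked = substr($packed, 0, $keep) . str_repeat("\0", strlen($packed) - $keep);
    return inet_ntop($masked);
}

// Replace the real address before the application gets to read it.
if (isset($_SERVER['REMOTE_ADDR'])) {
    $_SERVER['REMOTE_ADDR'] = maskIpAddress($_SERVER['REMOTE_ADDR']);
}
```

Keep in mind that anything comparing IPs afterwards (ban rules, spam checks) will only ever see the masked value.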
Some servers might add additional headers containing the user's IP, and therefore I would not only alter REMOTE_ADDR but also look for other occurrences of the IP address.
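A sketch of how that could look. The list of `$_SERVER` keys is an assumption: these are entries that proxies and load balancers commonly set, so check what your own server actually sends. The helper name `overwriteIpEntries` is made up for this example.

```php
<?php
// /conf/bootstrap.before.php
// Hypothetical helper: replace every listed entry that is present.
function overwriteIpEntries(array $server, array $keys, string $replacement): array {
    foreach ($keys as $key) {
        if (array_key_exists($key, $server)) {
            $server[$key] = $replacement;
        }
    }
    return $server;
}

// Assumed list of entries that can carry the client address; adjust it to
// whatever your server or proxy really sets.
$ipKeys = ['REMOTE_ADDR', 'HTTP_X_FORWARDED_FOR', 'HTTP_X_REAL_IP', 'HTTP_CLIENT_IP'];
$_SERVER = overwriteIpEntries($_SERVER, $ipKeys, '8.8.8.8');
```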
Preventing Vanilla from "knowing" the IP address will keep you from looking through each and every table.
Correct me if I'm wrong, but I have two kinds of users:
So the masking of the IP is for the anonymous users who come to visit my site.
Registered users, and new users who want to register, have to read the "Terms of Use" and the "data protection specifications" for my site and have to confirm them!
For them I could store personal information, IP, etc..
As far as I know, Vanilla doesn't store guests' IP addresses.