Copywriting · Legal

Is it OK to copy public web info into my database?

Ramūnas Jurgilas CEO @ UnknownName

February 4th, 2020


I am creating a database of companies' contact details and some resource files. All of this data will be publicly available for free through my search engine. I have concerns about what I am allowed to copy and what I am not. Is it allowed to copy the company details listed below from their websites?

  1. Company address
  2. Contact e-mail
  3. Contact phone number
  4. Social network URLs (e.g. Facebook, Instagram, YouTube)
  5. Short business description
  6. Logo
  7. Photos
  8. Working hours

I noticed that some companies do not have a copyright notice, while others have this: © 2019 Company Name | All Rights Reserved. Can somebody elaborate on what I can and cannot copy from these websites? It would help if someone could cover different markets such as the EU (European Union) and the US. Best, Ramunas

Dane Madsen Organizational and Operational Strategy Consultant

February 4th, 2020

In the US, the data is owned by the individual business. They have sole control over whether their data is displayed on your site. In the 1990s there was a significant lawsuit filed by telco yellow-page companies against independent publishers who were collecting the books and then hand-keying the data (for accuracy) or scanning it into a new book format. The telcos claimed that the work product (the book itself) was copyrighted because of the cost and effort of accumulating the data.

The courts ruled against them, so the data was scanned and compiled into new competing books. That ruling still stands in the US today: you can scrape the data and republish it without the data itself being the issue. However, your risk in the US is the T&Cs of the site, which almost always list scraping as a violation of their terms. If they catch you, they can file against you on that - not the data itself, but the scraping. Be careful and smart.

In the EU, the compiled work (the book itself) - again, going back to the yellow pages days - was protected (often the publisher was a state-owned telco like DT or FT). This prevented competitors from doing the same thing and forced startups to build their own databases (which, ironically, are now protected too). I am not aware of any change to that decision.

You should find competent legal counsel in every country before scraping to avoid the problem, or you can license a database from one of the leaders (in the US, Infutor is the largest - and good people - contact Brian Wool). They are expensive, with some databases from companies like Axiom costing as much as $750K per year. Worse, they have a high degree of inaccuracy due to the lag in accumulating the data or human errors in the process.

In the end, if you start this way, you need to accept that databases do have value and plan to build your own in some manner. Further, buying or scraping data is just the start. Keeping it current and clean is the critical dimension for your users. If they get bad data (depending on your use) just a few times, they will not return.

Happy to talk if you would like.

Paul Garcia marketing exec & business advisor

February 6th, 2020

@Dane is very right that before you gather and repackage someone else's information you need to speak with an attorney in each region who can give you relevant and accurate information about what is and is not allowed, especially if you intend to profit in some way from the repackaged information. Remember that the web won't limit who can get to your information. Although your "customer" target might be the US or EU, making data available online may have complications globally, particularly around the way you're encouraging people to use your directory.

There are specific laws that treat email scraping as a separate issue. The rules that apply when you merely display information may differ from those that apply if your content is ever used to contact these companies with a solicitation. So the purpose of your database is very important to understand. I see you said free, but the question is also about how the free information helps your business.

We're mostly guessing at the boundaries of your enterprise. That's why you need actual legal advice.

Edward de Jong Software designer and developer, programming language designer

February 9th, 2020

Basic company information is public information, and nobody is going to give a damn about that. Phone numbers change quite a bit and, unfortunately, easily go out of date. The URLs for companies are very stable. Logos and photos are absolutely going to get you in trouble. Logos are trademarks, and you are not permitted to use Apple's logo, for example. Using someone else's photos is a blatant violation of copyright and you should not do it. As for working hours, those change very slowly; the biggest problem is that holidays are often not noted, and if consumers rely on this information it could lead to unfortunate situations. As for the description of a company, stealing it from Hoover's or one of the other companies like that is not good. I don't know how many companies you are planning to cover, but this sounds like a wonderful AI project for a college student: figuring out a company description by studying a website.
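To make the field-by-field opinion above concrete, here is a minimal Python sketch of an allowlist applied to a scraped record before it is stored. The field names and the `filter_record` helper are hypothetical, chosen only for illustration, and the low-risk/high-risk split simply mirrors the view expressed in this answer; it is an assumption, not a legal rule.

```python
# Hypothetical field names; the split mirrors the opinion in this answer only.
LOW_RISK_FIELDS = {"address", "email", "phone", "social_urls", "working_hours"}

# Logos (trademarks), photos (copyright), and descriptions copied from
# another directory are the items flagged above as likely to cause trouble.
HIGH_RISK_FIELDS = {"logo", "photos", "description"}


def filter_record(scraped: dict) -> dict:
    """Return a copy of the record containing only the low-risk fields."""
    return {key: value for key, value in scraped.items() if key in LOW_RISK_FIELDS}


if __name__ == "__main__":
    record = {
        "address": "1 Example Street",
        "phone": "+1 555 0100",
        "logo": "logo.png",
        "photos": ["front.jpg"],
    }
    # Only the address and phone number survive the filter.
    print(filter_record(record))
```

A description written from your own reading of the company's website, as suggested above, would be a separate field that you author yourself rather than one you copy.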