Dataset
Google Dataset
Dataset Search enables users to find datasets stored across the Web through a simple keyword search. The tool surfaces information about datasets hosted in thousands of repositories across the Web, making these datasets universally accessible and useful. Google
Google Dataset Search covers the entire Web, which may render all of the following links redundant. Yet you may find yourself in a situation where you want to use different, more specialized tools. If so, the rest of this page is for you.
Kaggle dataset
Browse datasets from Kaggle with this straightforward tool that does what it says. Although Kaggle datasets are likely to come up on Google Dataset, here you can limit your search to a website you trust, with a nice UI displaying information such as date of upload, dataset size, file types, etc.
Academic Torrents
We've designed a distributed system for sharing enormous datasets - for researchers, by researchers. The result is a scalable, secure, and fault-tolerant repository for data, with blazing fast download speeds. Contact us at contact@academictorrents.com. Academic Torrents
- Lo, Henry Z. and Cohen, Joseph P., (2016). Academic Torrents: Scalable Data Distribution. Neural Information Processing Systems 2015 Challenges in Machine Learning (CiML) Workshop. http://arxiv.org/abs/1603.04395
- Cohen, Joseph P. and Lo, Henry Z., (2014). Academic Torrents: A Community-Maintained Distributed Repository (p. 2:1–2:2). New York, NY, USA: ACM. http://doi.org/10.1145/2616498.2616528
Additional links
- A living document of openly available data sets on the internet in a variety of domains/formats for use in data science and machine learning projects/analyses. Also useful for educational exercises and examples. github:hopelessoptimism/datasets
- This is a list of topic-centric public data sources in high quality. They are collected and tidied from blogs, answers, and user responses. Most of the data sets listed below are free; however, some are not. github:awesomedata/awesome-public-datasets