Research Datasets
Latest revision as of 00:57, 12 September 2024
Datasets
Datasets to use for Research (See Research wiki)
Find additional datasets behind a login in the community dataset section
Finding Datasets
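- [Google Dataset Search](https://datasetsearch.research.google.com/)
- [Google Scholar](https://scholar.google.com/schhp?hl=en): Google search for academic literature.
- [JSTOR](https://www.jstor.org/): digital library of academic journals, books, and primary sources.
- [ResearchGate](https://www.researchgate.net/): large repository of academic publications.
- [Google Dork](https://www.google.com/search?q=site%3A.edu+%22free%22+%28%22research%22+or+%22dataset%22): prebuilt Google query for free academic research resources on .edu sites.
- [Elicit](https://elicit.org): AI journal search.
- [Open Science Framework (OSF)](https://osf.io/): a free and open-source platform designed to support research and collaboration across the research life cycle.
- [Awesome Public Datasets](https://github.com/awesomedata/awesome-public-datasets): a curated list of over 40,000 public datasets across various topics.
- [Awesome Cyber Threat Datasets](https://github.com/hslatman/awesome-threat-intelligence): a curated list of threat intelligence resources. Threat intelligence provides context, mechanisms, indicators, implications, and actionable advice about an existing or emerging menace or hazard to assets, and can inform decisions about how to respond to it.
- [r/datasets](https://www.reddit.com/r/datasets/)
- [r/Dissertation](https://www.reddit.com/r/Dissertation/)
- [r/AskAcademia](https://www.reddit.com/r/AskAcademia/)
- [r/GradSchool](https://www.reddit.com/r/GradSchool/)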
Queries for Datasets
"Search_TERM_HERE" site:vision.in.tum.de OR site:www.cdbb.cam.ac.uk OR site:bimportal.scottishfuturestrust.org.uk OR site:digicatapult.org.uk OR site:pewresearch.org OR site:odsc.com OR site:archive.ics.uci.edu OR site:research.tudelft.nl OR site:archive.data.jhu.edu OR site:systems.jhu.edu
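The query above can also be assembled programmatically. The following is a minimal sketch, assuming you want to reuse the same site list with different search terms; the names `DATASET_SITES`, `build_dataset_query`, and `build_search_url` are hypothetical helpers introduced here for illustration and are not part of the wiki.

```python
from urllib.parse import quote_plus

# Domains copied from the query above; edit this list to suit your topic.
DATASET_SITES = [
    "vision.in.tum.de",
    "www.cdbb.cam.ac.uk",
    "bimportal.scottishfuturestrust.org.uk",
    "digicatapult.org.uk",
    "pewresearch.org",
    "odsc.com",
    "archive.ics.uci.edu",
    "research.tudelft.nl",
    "archive.data.jhu.edu",
    "systems.jhu.edu",
]


def build_dataset_query(term, sites=DATASET_SITES):
    """Return the quoted term followed by site: filters joined with OR."""
    site_filter = " OR ".join(f"site:{site}" for site in sites)
    return f'"{term}" {site_filter}'


def build_search_url(term, sites=DATASET_SITES):
    """Return a Google search URL with the query URL-encoded."""
    return "https://www.google.com/search?q=" + quote_plus(
        build_dataset_query(term, sites)
    )


if __name__ == "__main__":
    # Paste the printed query into a search box, or open the printed URL.
    print(build_dataset_query("air quality"))
    print(build_search_url("air quality"))
```

Keeping the site list in one place makes it easy to add or drop domains without retyping the whole query.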
Case Studies and Projects Using Datasets
- [Global Deaths by Risk Factors GitHub Repo](https://github.com/vizdata-f21/project-2-tidy_team?tab=readme-ov-file): interactive spatio-temporal visualization of worldwide deaths related to various risk factors, specifically air pollution, substance use, and lack of sanitation.
Known Datasets
Dataset | Comments | Free (Y/N) | Category | Region |
---|---|---|---|---|
[Smithsonian Library Resources](https://library.si.edu/research/free-databases-and-collections) | Databases, collections, and search tools selected by Smithsonian Libraries staff that are freely available online. | Y | Academic | Global |
[CrossSub](http://cross-sub.org/) ([Alt Link](https://yz-data.shinyapps.io/xsub/)) | Micro-level, subnational event data on armed conflict and contention around the world. | Y | Conflict | Global |
[ACLED](https://acleddata.com/#/dashboard) | Real-time data on the locations, dates, actors, fatalities, and types of all reported political violence and protest events around the world. | N | Conflict | Global |
[OSMP](https://osmp.airwars.org) | Open Source Munitions Portal (OSMP), an open-source portal launched by Airwars and Armament Research Services (ARES). A useful database for anyone covering armed conflicts or wars. | Y | Conflict | N/A |
[LiveUA](https://liveuamap.com/) | Reporting on conflicts, human rights issues, protests, terrorism, weapons deployment, health matters, natural disasters, and weather-related stories, drawn from a wide array of sources. | Y | Conflict | Ukraine |
[Venezuelan Violence Data](https://observatoriodeviolencia.org.ve/news/annual-report-violence-2023/) | Annual violence report for 2023. | Y | Conflict | LATAM - Venezuela |
[World City Database](https://www.google.com/search?q=inurl%3Ahttps%3A%2F%2Fsimplemaps.com%2Fdata%2F') | Database of cities with population and approximate latitude and longitude. | Y | Country Data | Global |
[TradingEconomics](https://tradingeconomics.com/indicators) | Large database of metrics and indicators by country over time. | Y | Country Data | Global |
[GitNux Crime Reports](https://gitnux.org/topics/statistics/crime-and-safety-statistics/) | Crime reports and statistics. | | Crime | Global |
[Cloudflare Radar](https://radar.cloudflare.com/) | A view of outages, threats, rankings, and more, based on Cloudflare's network data. | Y | Cyber | Global |
[TUM Data](https://vision.in.tum.de/data) | Large collection of datasets for computer vision research. | Y | Cyber | |
[CEPAL Cyber Attacks](https://repositorio.cepal.org/server/api/core/bitstreams/2db8feef-29d6-4981-9741-9ad3154d3789/content) | Cyber attacks in LATAM. | Y | Cyber | LATAM |
[OECD](https://www.oecd.org/investment/statistics.htm) | Foreign Direct Investment (FDI) statistics. | Y | Finance & Business | Global |
[World Bank Data](https://data.worldbank.org/) | Economic datasets. | Y | Finance & Business | Global |
[Scottish Futures Trust ROI Calculator](https://bimportal.scottishfuturestrust.org.uk/page/roi-calculator) | Calculator for the expected return on investment of a building project. | Y | Finance & Business | |
[Numbeo](https://www.numbeo.com/cost-of-living/) | Cost-of-living calculator and comparison tool. Useful for comparing average prices around the world. | Y | Finance & Business | Global |
[Reportlinker](https://ai.reportlinker.com/pricing) | AI-enabled market intelligence platform. | Y | Finance & Business | Global |
[Kaggle](https://www.kaggle.com/datasets) | Data repository with many datasets, including those used in competitions. | Y | Machine Learning | Global |
[YouTube](https://www.youtube.com/) | Searchable video platform with machine learning videos. | Y | Machine Learning | Global |
[Amazon S3](https://www.amazon.com/AWS/Amazon-S3) | Various datasets provided by Amazon Web Services. | Y | Data Storage | Global |
[AMI Data Set](https://r-nd.ami.btu.de/) | Data from a wide range of sources, including the finance sector. | Y | Data Storage | Global |
[Boston Data](https://bostondata.org/) | Boston city datasets. | Y | City Data | Boston |
[Social Europe](https://www.socialeurope.eu/) | Social data. | Y | Social | Global |
[US Census](https://www.census.gov/) | Census data. | Y | Demographics | USA |
[US Government Data](https://data.gov/) | Government data. | Y | Government | USA |
[UK Government Data](https://www.data.gov.uk/) | Government data. | Y | Government | UK |