These are the datasets used in our labs.
They are also hosted on Amazon S3 as a zip archive: https://s3.amazonaws.com/elephantscale-public/data/data.zip
Individual files are available under the S3 base URL https://s3.amazonaws.com/elephantscale-public/data/. To build a file's URL, append its path to the base, e.g. https://s3.amazonaws.com/elephantscale-public/data/house-prices/house-sales-sample.csv
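
The base-URL-plus-path scheme described above can be sketched in Python as follows. This is a minimal illustration, not part of the lab tooling: the helper names `dataset_url` and `download_dataset` and the local `data` destination directory are assumptions made for the example. Only the base URL and the sample path come from this README.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlretrieve

# Base URL taken from this README.
BASE_URL = "https://s3.amazonaws.com/elephantscale-public/data/"


def dataset_url(relative_path: str) -> str:
    """Build the S3 URL for a dataset file (hypothetical helper)."""
    # Drop any leading slash so the path is always appended under BASE_URL,
    # and percent-encode characters that are unsafe in a URL ("/" is kept).
    return BASE_URL + quote(relative_path.lstrip("/"))


def download_dataset(relative_path: str, dest_dir: str = "data") -> Path:
    """Download one file into dest_dir, mirroring its relative path.

    The default dest_dir is an assumption for this sketch.
    """
    target = Path(dest_dir) / relative_path.lstrip("/")
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(dataset_url(relative_path), str(target))
    return target


# Example call (requires network access):
# download_dataset("house-prices/house-sales-sample.csv")
```

Mirroring the relative path locally keeps files from different dataset directories from overwriting each other when they share a file name.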
See instructions in README-dev.md
- AirBNB rentals NYC 2019
- US presidential election contributions - 2012 and 2016 elections. A reasonably large dataset
- Wine reviews - text reviews and ratings from a wine magazine
- Diabetes data
- Streaming movies data
- College Admissions - a fairly simple dataset
- Churn - predict customer turnover in telecom
- Credit Card default
- Wine quality - estimate wine quality on a scale of 1 to 10
- Customer tickets data
- Sarcasm data - news stories, for sentiment analysis
- AudioScrobble
- Cars
- Churn
- ClickStream
- College Admissions
- Commodities
- Credit Card default
- Diabetes data
- Economic Numbers
- Election Contributions
- House Prices
- IMDB Metadata
- JSON Data
- Misc
- Mortgage Applications
- MovieLens Recommendations
- Netflix Recommendations
- NYC Flight Delays
- NYSE
- Presidential Election Contributions
- Prosper Loan
- SF Crime
- Spam SMS
- Spark Commit Logs
- Stock Market Data
- Text
- Tips
- Uber data for New York City
- Walmart Trip Types
- Zipcodes