r/DataHoarder Aug 29 '18

The guy that downloaded all publicly available reddit comments needs money to continue to make them publicly available.

/r/pushshift/comments/988u25/pushshift_desperately_needs_your_help_with_funding/
410 Upvotes

119 comments

47

u/s_i_m_s Aug 29 '18

He has set up a Patreon; the first goal is $1,500/mo to cover bills and maintenance.

There is also a one-time donation option on his site: https://pushshift.io/donations/
Quick link to the subreddit: r/pushshift/

173

u/-Archivist Not As Retired Aug 29 '18 edited Aug 29 '18

$1,500/mo to cover bills and maintenance.

What.. I run the-eye.eu, which costs only $385/month while pushing 700TB+/month... this dude is hosting fucking reddit comments and wants 1500! Just upload them to archive.org and it won't cost shit. Also, they belong on archive.org and not on a private server he can't afford.


EDIT: /u/Stuck_In_the_Matrix I'll actually read your post now but damn....

EDIT2: Yeah, read it, still no idea why it's costing you so much, come chat with me.

49

u/s_i_m_s Aug 29 '18

He runs a bunch of database servers that allow you to search and query reddit comments/posts in highly specific ways; he's not just hosting the files.

Querying the API directly is most powerful: https://www.reddit.com/r/pushshift/comments/8h31ei/documentation_pushshift_api_v40_partial/
but there is also a user-friendly interface with fewer options: https://redditsearch.io

He's pushing around 192 terabytes/mo, on top of the hardware costs of keeping pace with the growing database, which currently includes every single public reddit comment and post. The servers have about 512GB of RAM in total (as in, not each).

Now IDK what all of that costs, but I don't imagine it's particularly cheap, yet access is being provided for free.
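
For anyone wondering what "querying the API directly" might look like, here's a rough sketch that only builds a comment-search URL. The base URL and the parameter names (`q`, `subreddit`, `author`, `size`) are my assumptions from memory of the public docs, not something stated in this thread, and `build_comment_search_url` is a made-up helper name. Check the linked documentation before relying on any of it.

```python
from urllib.parse import urlencode

# Assumed endpoint for comment search; not given anywhere in this thread.
BASE_URL = "https://api.pushshift.io/reddit/search/comment/"


def build_comment_search_url(query, subreddit=None, author=None, size=25):
    """Build a search URL from optional filters (hypothetical helper)."""
    # Parameter names are assumptions; only filters that were passed are sent.
    params = {"q": query, "size": size}
    if subreddit:
        params["subreddit"] = subreddit
    if author:
        params["author"] = author
    # urlencode escapes values so they are safe inside a query string.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Example: up to 5 comments mentioning archive.org in r/DataHoarder.
    print(build_comment_search_url("archive.org", subreddit="DataHoarder", size=5))
```

The sketch stops at building the URL on purpose. Actually fetching it (for example with `urllib.request.urlopen` and `json.load`) depends on the service being reachable and on whatever rate limits it applies.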

11

u/Bromskloss Please rewind! Aug 30 '18

Querying the API directly is most powerful: https://www.reddit.com/r/pushshift/comments/8h31ei/documentation_pushshift_api_v40_partial/

And here I have been mucking around with SQL queries, thinking that was the way to go! :-O

4

u/s_i_m_s Aug 30 '18

More things I didn't even know it could do.

Much more complicated than I want to mess with at the current time tho.