By: Mike Johnson Jr

Date Created: July 25, 2016, 9:20 p.m.

The Coontown Breakdown 2: Electric Boogaloo - More Data, Less Prose

 

 

Audioburn, 7/17/2015

Part 1: The Coontown Breakdown: A Brief Overview of the Habits of Coontown Users

Introduction

Back by popular demand, here is The Coontown Breakdown 2: Electric Boogaloo. I'm back with a new sample of ~1000 users (995 to be exact), with debugged code, and with two additions: saved usernames and subreddit visit frequency (as opposed to karma scores).

This time around, I invite you all to make your own observations from the data, because, as I've learned from part 1 of this study, my own observations can skew perceptions and conclusions.

/r/Coontown is again omitted from the visualizations (it always comes in at #1 by a long shot in these studies). If you'd like to see it, the raw JSON datasets still include /r/coontown.
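For anyone working from the raw JSON rather than the charts, here is a minimal sketch of how the same omission and ranking could be done. It assumes the layout written by the script in the Code section (top-level `submissions` and `comments` dicts mapping a subreddit name to a number). `top_subreddits` and `load_breakdown` are hypothetical helper names, not part of the original code, and this is not necessarily how the charts were built.

```python
import json


def top_subreddits(counts, n=10, omit=('coontown',)):
    """Return the n highest-valued (subreddit, value) pairs, skipping omitted names."""
    # Compare names case-insensitively, since display names may be mixed case
    omitted = set(name.lower() for name in omit)
    kept = [(name, value) for name, value in counts.items()
            if name.lower() not in omitted]
    # Highest value first; ties are broken alphabetically for a stable order
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:n]


def load_breakdown(path='coontown_breakdown.json'):
    """Load the dict that the script saved with json.dump."""
    with open(path) as fp:
        return json.load(fp)
```

With a downloaded dataset in the working directory, `top_subreddits(load_breakdown()['comments'], n=20)` would give the twenty most-commented subreddits with /r/coontown left out; passing `omit=()` keeps it in.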

Data



 

Code

Here is the code that made it happen:

	
		import praw
		import json
		users = []
		submissions = []
		r = praw.Reddit(user_agent='africanawiki')
		subreddit = r.get_subreddit('coontown')
		#get submission object ids
		for i,submission in enumerate(subreddit.get_hot(limit=350)): #takes around 3-4 hours
			print 'getting submission object %s' % (i)
			submissions.append(r.get_submission(
				submission_id=submission.id))
		root_comments = []
		for i,s in enumerate(submissions):
			print 'getting comments %s of %s' % (
				i, len(submissions))
			for c in s.comments:
				root_comments.append(c)
		def get_comments(comments,level):
			for i,c in enumerate(comments):
				try:
					print 'getting comment count: %s in level %s' % (
						i,level)
					if c.author.name not in users:
						users.append(c.author.name)
				except AttributeError:				
					print 'nada'
				if hasattr(c,'replies'):
					#pass level + 1 instead of mutating level, so sibling comments report the same depth
					get_comments(c.replies,level + 1)
				
		get_comments(root_comments,0)
		kb_submissions = {}
		kb_comments = {}
		for idx,username in enumerate(users):
			try: 
				print 'getting info for %s, %s of %s' % (username,idx,len(users))
				user = r.get_redditor(username)
				submissions = user.get_submitted(limit=None)
				comments = user.get_comments(limit=None)
				for s in submissions:
					subreddit = s.subreddit.display_name
					kb_submissions[subreddit] = (
						kb_submissions.get(subreddit, 0) + s.score) #sums karma; use 1 instead of s.score to count frequency
				for c in comments:
					subreddit = c.subreddit.display_name
					kb_comments[subreddit] = (
						kb_comments.get(subreddit, 0) + c.score) #sums karma; use 1 instead of c.score to count frequency
			except Exception: #most often a deleted account; unlike a bare except, Ctrl-C still stops the run
				print 'user deleted his/her account, smart'
		karma_by_subreddit = {
			'submissions':kb_submissions,
			'comments':kb_comments,
			'users':users,
		}
		#save object to disk as json 
		with open('coontown_breakdown.json','w') as fp:
			json.dump(karma_by_subreddit,fp)
	

Knock yourselves out analysing other subreddits.
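The script above sums karma, and the comments on its two tally lines note that swapping the score for 1 gives visit frequency instead. Here is a small sketch of that switch as a single function. It takes plain (subreddit, score) pairs in place of PRAW objects so it can run offline; `tally` and `use_frequency` are hypothetical names introduced for illustration.

```python
def tally(items, use_frequency=False):
    """Total per subreddit: summed score by default, or a simple count."""
    totals = {}
    for subreddit, score in items:
        # Frequency mode adds 1 per post or comment; karma mode adds its score
        increment = 1 if use_frequency else score
        totals[subreddit] = totals.get(subreddit, 0) + increment
    return totals
```

In the original loops, the pairs would correspond to `(s.subreddit.display_name, s.score)` for each submission and `(c.subreddit.display_name, c.score)` for each comment.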

You can find more at my GitHub repository, which also contains the code for Agile, my open source data visualization app. I used it, together with Highcharts, to build the charts above.

Raw Data

Here are some links to the raw data (warning: auto-download):

Peace

Thanks for viewing. This is the final Coontown analysis I'll be doing, and I hope you enjoyed it. More visualizations on other interesting topics are coming soon.

code_black()