An Exercise in Documenting Current State: Reddit and the Decision Tree

In an earlier post I detailed how you can use RML models to document the current state of your systems and processes, using a Business Data Diagram to illustrate the hierarchy and relationships of objects on the popular content website Reddit.com. In this post we’ll continue documenting Reddit’s functionality by using a Decision Tree to show how posts are selected and displayed in a Subreddit view.

If you’ll recall, Reddit has many communities to which users can post material, and these are called “Subreddits”. Users can subscribe to Subreddits, and it’s from these subscriptions that a user’s “Front Page” is populated. Different rules apply depending on how you filter the content. “Top” lists all content from your subscriptions by total number of votes per post, in descending order. “New” lists content by submission time, most recent first. “Rising” is a function of the “New” list, populated based on a mixture of submission time and an algorithm that determines which posts may become popular. “Controversial” is filtered by an algorithm that weighs how close a post’s upvote and downvote counts are. “Hot” is determined by an algorithm that attempts to show the posts people are enjoying right now.
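The two simplest views can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not Reddit's actual code: the `Post` class and its fields are hypothetical, and it assumes “New” shows the most recent submissions first. “Rising”, “Controversial”, and “Hot” are left out because their algorithms are not spelled out in this post.

```python
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical fields, chosen only for this illustration.
    title: str
    votes: int           # total votes on the post
    submitted_at: float  # submission time as a Unix timestamp


def top_view(posts):
    """'Top': highest vote total first."""
    return sorted(posts, key=lambda p: p.votes, reverse=True)


def new_view(posts):
    """'New': most recently submitted first (an assumption)."""
    return sorted(posts, key=lambda p: p.submitted_at, reverse=True)
```

Both functions take the combined list of posts from a user's subscribed Subreddits and return a new, ordered list without changing the original.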

This is a perfect opportunity to use a Decision Tree to illustrate how the system may populate a given view, since a fixed set of questions determines what you see when you access that view.

Reddit Hot Algorithm Decision Tree

It should be noted that this is a very high-level explanation of Reddit’s hot algorithm (a deeper explanation can be found here), but it captures the gist of how the process works: given a list of posts from a list of Subreddits, each post is evaluated on its age and the velocity of its votes. Posts that have reached the 10-vote threshold once an hour has passed are weighted more heavily than posts that have fewer than 10 votes at that point or that are less than an hour old.
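The same questions can be written out as nested checks, which is roughly what a Decision Tree encodes. The sketch below is a simplified reading of the high-level description above, not Reddit's real hot algorithm: the function name, the constants, and the "boosted" and "standard" labels are all invented for illustration.

```python
VOTE_THRESHOLD = 10        # votes needed for the heavier weighting (from the text)
AGE_THRESHOLD_HOURS = 1.0  # minimum post age before the vote check applies


def hot_weight(votes: int, age_hours: float) -> str:
    """Walk the decision tree for one post and return a hypothetical weight label."""
    # Question 1: has the post been up for at least an hour?
    if age_hours < AGE_THRESHOLD_HOURS:
        return "standard"
    # Question 2: has it reached the vote threshold?
    if votes >= VOTE_THRESHOLD:
        return "boosted"
    return "standard"
```

For example, `hot_weight(12, 1.5)` returns `"boosted"`, while a post with 12 votes that is only half an hour old, or a two-hour-old post with 3 votes, returns `"standard"`.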

Every model in the RML toolset can serve as a reference for your team, or help in creating playbooks that new team members can use when spinning up on a project.
