2 posts tagged “social networking”
The Idea
Ok, that's a bit of a mouthful, but why not. Pixel Corps bigwig and TWiM/MBW host Alex Lindsay has this dream for a website (in his case a video website) that looks at how you rate videos and compares that to how others rate videos. Then, when people with rating trends similar to yours rate things highly, you'll be shown those things. That way you'll only get content that's good. You know it's good because people who like the same things you like said it's good.
Well, I think this is a fantastic idea. In fact, I think it should be done for more than just video; it should be done for anything and everything. So I've come up with a basic algorithm for how this could be done. I offer this as an open spec for how this type of system could work.
Tracking Rating Trends Between Two People
Suppose we have two people, Person 1 and Person 2. To track how these two people's rating patterns compare, four pieces of data are required:
1: The number of times both people agreed that something was good,
2: The number of times both people agreed that something was bad,
3: The number of times Person 1 thought something was good, but Person 2 thought it was bad,
4: The number of times Person 2 thought something was good, but Person 1 thought it was bad.
I'll label these "Good to Both", "Bad to Both", "Good to 1", and "Good to 2", respectively. In order to track this, we need to track how people rated a particular item. When Person 1 rates something, his opinion is recorded. Later when Person 2 rates it, Person 1's opinion is compared to Person 2's new rating, and the appropriate number (Good to Both, Bad to Both, Good to 1, Good to 2) is incremented. This provides data about how any two people's rating patterns compare.
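As a minimal sketch of that bookkeeping, assuming each rating is a simple boolean (True for good), the comparison of two opinions on one item might look like this. The function name classify and the sample ratings are made up for illustration.

```python
from collections import Counter

def classify(p1_liked: bool, p2_liked: bool) -> str:
    """Return which of the four counters a single shared item increments."""
    if p1_liked and p2_liked:
        return "Good to Both"
    if not p1_liked and not p2_liked:
        return "Bad to Both"
    # The two people disagree: credit whichever one liked the item.
    return "Good to 1" if p1_liked else "Good to 2"

# Hypothetical opinions on five items that both people have rated:
# (Person 1's opinion, Person 2's opinion), True meaning "good".
shared_ratings = [(True, True), (True, False), (False, False),
                  (True, True), (False, True)]

counts = Counter(classify(p1, p2) for p1, p2 in shared_ratings)
print(dict(counts))
```

In a real system the counters would be updated one rating at a time, as described above, rather than tallied from a list.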
Deciding Who to Track
When a person rates a piece of content, the process of tracking the rating trends is performed for pairs formed by that person and everyone else who's rated this piece of content. If two people have never rated the same item before, there won't be any record of the 4 numbers listed above, so a new record would need to be created. This creates a pairing between the two people. Once that's done, or if the two people already have a pairing, then the record is modified to reflect the new information.
Recommending Content
Once a person rates an item, there is now the potential to recommend that item to everyone he has pairings with, excluding those who have already voted on the item (because they've already seen it). To generate a recommendation, we look at those four numbers and at how the person rated the item. Assuming the person who is going to make the recommendation is Person 1, we need to look at two numbers.
If Person 1 liked the item, then the important number is Good to 1 + Good to Both. This number is the total number of items that Person 1 thought were good. From this we know what fraction of those Person 2 also thought were good: Good to Both / (Good to 1 + Good to Both). This fraction tells you what proportion of things Person 2 liked, when Person 1 liked them. If this proportion is above a certain user-definable percentage, then you'll be shown the item.
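To make the arithmetic concrete, here is a small sketch of that fraction and the threshold test. The function name, the sample counts, and the 75% threshold are assumptions for illustration; the spec only says the threshold is user-definable.

```python
from typing import Optional

def like_agreement(good_to_both: int, good_to_1: int) -> Optional[float]:
    """Fraction of the items Person 1 liked that Person 2 also liked."""
    liked_by_1 = good_to_1 + good_to_both
    if liked_by_1 == 0:
        return None  # Person 1 has not liked any shared item yet
    return good_to_both / liked_by_1

threshold = 0.75  # hypothetical user-chosen value
fraction = like_agreement(good_to_both=8, good_to_1=2)  # 8 / (2 + 8)
show_item = fraction is not None and fraction > threshold
print(fraction, show_item)
```

Returning None when there is no history is a choice made for this sketch; the spec does not say what should happen before two people share any liked items.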
Advising Against Content
Alternatively, we can look at the total number of items that Person 1 thought were bad, Good to 2 + Bad to Both (Good to 2 counts the items Person 2 liked but Person 1 disliked), and at the proportion of those that Person 2 also thought were bad: Bad to Both / (Good to 2 + Bad to Both). This tells us how often Person 2 disliked a piece of content when Person 1 disliked it. When Person 1 rates something as bad, we can determine how likely Person 2 is to also think it's bad, and again if the proportion is above a user-definable percentage, we can specifically prevent the item from being recommended to Person 2.
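The same calculation for dislikes can be sketched as follows. Of the four tracked counters, the items Person 1 disliked are Bad to Both plus Good to 2 (the items Person 2 liked but Person 1 did not), so those two numbers form the denominator. The function name, the counts, and the 60% threshold are illustrative assumptions.

```python
from typing import Optional

def dislike_agreement(bad_to_both: int, good_to_2: int) -> Optional[float]:
    """Fraction of the items Person 1 disliked that Person 2 also disliked."""
    disliked_by_1 = good_to_2 + bad_to_both
    if disliked_by_1 == 0:
        return None  # Person 1 has not disliked any shared item yet
    return bad_to_both / disliked_by_1

threshold = 0.6  # hypothetical user-chosen value
fraction = dislike_agreement(bad_to_both=9, good_to_2=3)  # 9 / (3 + 9)
hide_item = fraction is not None and fraction > threshold
print(fraction, hide_item)
```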
Anti-Recommendations
A convenient side effect of this way of generating recommendations is that it doesn't matter if Person 1 and 2 completely agree or completely disagree in what's good and what's bad. If Person 2 always likes the things that Person 1 dislikes, the system will recommend what Person 1 dislikes to Person 2, and vice versa if Person 2 always dislikes what Person 1 likes. This occurs because of the thresholding technique.
Suppose Person 1 rates something as good, but the proportion of Person 1's liked items that Person 2 also likes is very low, below the threshold. That item will be hidden. This makes sense because if the proportion is low, that means Person 2 likes very little of what Person 1 likes. Conversely, if Person 1 rates something as bad, but the proportion of things that Person 2 likes when Person 1 dislikes them is high, then the item is recommended. This also makes sense, because Person 2 usually likes what Person 1 dislikes.
By making recommendations based on the number of items that Person 2 likes, relative to the particular rating by Person 1, we can make smart decisions about what is and isn't going to be liked by Person 2. And you can see that it doesn't matter how similar two people are; even people with completely opposite tastes will be good sources of new content. If they have similar tastes, recommend the good items and hide the bad; if they have opposite tastes, recommend the opposite items. The more similar or dissimilar the better, because the proportions of liked items will be closer to 100%.
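A quick numeric illustration of why both similar and opposite tastes are useful. The counts below are invented, and the function assumes Person 1 has both liked and disliked at least one shared item, so neither denominator is zero.

```python
def p2_like_rates(good_to_both, bad_to_both, good_to_1, good_to_2):
    """Return (share of Person 1's likes that Person 2 liked,
               share of Person 1's dislikes that Person 2 liked)."""
    liked_by_1 = good_to_both + good_to_1
    disliked_by_1 = bad_to_both + good_to_2
    return good_to_both / liked_by_1, good_to_2 / disliked_by_1

# Three hypothetical pairings, each built from 20 shared items.
pairs = {
    "similar tastes": dict(good_to_both=9, bad_to_both=9, good_to_1=1, good_to_2=1),
    "opposite tastes": dict(good_to_both=1, bad_to_both=1, good_to_1=9, good_to_2=9),
    "unrelated tastes": dict(good_to_both=5, bad_to_both=5, good_to_1=5, good_to_2=5),
}

for name, counts in pairs.items():
    after_like, after_dislike = p2_like_rates(**counts)
    print(f"{name}: Person 2 likes {after_like:.0%} of Person 1's likes "
          f"and {after_dislike:.0%} of Person 1's dislikes")
```

The similar and opposite pairings both give proportions far from 50%, which is what makes Person 1's rating informative for Person 2. The unrelated pairing sits at 50% either way, so Person 1's rating says little about what Person 2 will think.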
Database Model
Table 1: Item Opinions
Item ID, User ID, Rating (Good/Bad)
Table 2: Pairings
User 1 ID, User 2 ID, Good to Both, Bad to Both, Good to 1, Good to 2
Table 3: Users
User ID, ..., Threshold
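One way to make these three tables concrete is a SQLite schema, sketched here through Python's built-in sqlite3 module. The column names, the default threshold of 0.75, and the rule that the smaller user ID always goes in the User 1 column are assumptions of this sketch, not part of the spec.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id   INTEGER PRIMARY KEY,
    threshold REAL NOT NULL DEFAULT 0.75    -- assumed default
);
CREATE TABLE item_opinions (
    item_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    rating  TEXT NOT NULL CHECK (rating IN ('Good', 'Bad')),
    PRIMARY KEY (item_id, user_id)           -- one opinion per user per item
);
CREATE TABLE pairings (
    user_1_id    INTEGER NOT NULL REFERENCES users(user_id),
    user_2_id    INTEGER NOT NULL REFERENCES users(user_id),
    good_to_both INTEGER NOT NULL DEFAULT 0,
    bad_to_both  INTEGER NOT NULL DEFAULT 0,
    good_to_1    INTEGER NOT NULL DEFAULT 0,
    good_to_2    INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (user_1_id, user_2_id),
    CHECK (user_1_id < user_2_id)            -- assumed: one row per pair
);
"""

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript(SCHEMA)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Storing each pair in a fixed order means a lookup never has to check both (A, B) and (B, A), which is the double condition the pseudocode has to spell out.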
Algorithm
Rate an Item:
// compare this user's new Rating with every earlier opinion on the same item
for each Item Opinion where Item ID == this item and User ID != this user {
    if Pairing where (User 1 ID == this user and User 2 ID == this Item Opinion User ID)
            or (User 1 ID == this Item Opinion User ID and User 2 ID == this user) does not exist {
        create pairing with all four counts set to 0
    }
    current pairing = Pairing where (User 1 ID == this user and User 2 ID == this Item Opinion User ID)
            or (User 1 ID == this Item Opinion User ID and User 2 ID == this user);
    if this Rating == Good and this Item Opinion Rating == Good {
        current pairing good to both ++
    } else if this Rating == Bad and this Item Opinion Rating == Bad {
        current pairing bad to both ++
    } else if this Rating == Good and this Item Opinion Rating == Bad {
        // this user liked it, so credit whichever slot this user occupies
        if this user == current pairing User 1 ID { current pairing good to 1 ++ }
        else { current pairing good to 2 ++ }
    } else if this Rating == Bad and this Item Opinion Rating == Good {
        // the other user liked it, so credit the other user's slot
        if this user == current pairing User 1 ID { current pairing good to 2 ++ }
        else { current pairing good to 1 ++ }
    }
}
// finally, store this user's own opinion so later raters can be compared against it
create Item Opinion (this item, this user, this Rating)
Recommending an Item:
// "other user" is the member of the pairing who is not this user
for each Pairing where (User 1 ID == this user or User 2 ID == this user)
        and the other user has no Item Opinion for this item {
    // the threshold used is the other user's, since they receive the recommendation
    if this pairing User 1 ID == this user {
        if this Rating == Good {
            if Good to Both / ( Good to Both + Good to 1 ) > other user threshold { make recommendation }
            else { recommend against }
        } else {
            // items this user disliked = Bad to Both + Good to 2
            if Bad to Both / ( Bad to Both + Good to 2 ) > other user threshold { recommend against }
            else { make recommendation }
        }
    } else {
        if this Rating == Good {
            if Good to Both / ( Good to Both + Good to 2 ) > other user threshold { make recommendation }
            else { recommend against }
        } else {
            // items this user disliked = Bad to Both + Good to 1
            if Bad to Both / ( Bad to Both + Good to 1 ) > other user threshold { recommend against }
            else { make recommendation }
        }
    }
}
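For anyone who wants to experiment, here is a rough in-memory Python sketch of both procedures. It is not a reference implementation. It assumes the threshold consulted is the one set by the person receiving the recommendation, falls back to an invented default of 0.75, skips pairings with no comparable history, and does not handle a user re-rating the same item. All names are made up for this sketch.

```python
from dataclasses import dataclass

@dataclass
class Pairing:
    user_1: str
    user_2: str
    good_to_both: int = 0
    bad_to_both: int = 0
    good_to_1: int = 0
    good_to_2: int = 0

opinions = {}                 # item -> {user: liked}
pairings = {}                 # frozenset({user_a, user_b}) -> Pairing
thresholds = {"bob": 0.6}     # hypothetical per-user thresholds
DEFAULT_THRESHOLD = 0.75      # assumed fallback, not from the spec

def rate(item, user, liked):
    """Record a rating and update the pairing with every earlier rater."""
    earlier = opinions.setdefault(item, {})
    for other, other_liked in earlier.items():
        if other == user:
            continue
        pair = pairings.setdefault(frozenset((user, other)), Pairing(user, other))
        user_is_1 = pair.user_1 == user
        if liked and other_liked:
            pair.good_to_both += 1
        elif not liked and not other_liked:
            pair.bad_to_both += 1
        else:
            # Exactly one of the two liked it; find which slot that person holds.
            liker_is_1 = user_is_1 if liked else not user_is_1
            if liker_is_1:
                pair.good_to_1 += 1
            else:
                pair.good_to_2 += 1
    earlier[user] = liked

def recommendations(item, user):
    """Map each paired user who has not rated the item to a decision."""
    liked = opinions[item][user]
    result = {}
    for pair in pairings.values():
        if user not in (pair.user_1, pair.user_2):
            continue
        user_is_1 = pair.user_1 == user
        other = pair.user_2 if user_is_1 else pair.user_1
        if other in opinions[item]:
            continue  # they have rated it, so they have already seen it
        good_to_user = pair.good_to_1 if user_is_1 else pair.good_to_2
        good_to_other = pair.good_to_2 if user_is_1 else pair.good_to_1
        # Agreement among the items this user rated the same way as now.
        agreed = pair.good_to_both if liked else pair.bad_to_both
        total = agreed + (good_to_user if liked else good_to_other)
        if total == 0:
            continue  # no comparable history for this kind of rating
        above = agreed / total > thresholds.get(other, DEFAULT_THRESHOLD)
        # Like + high agreement, or dislike + low agreement, means recommend.
        result[other] = "recommend" if above == liked else "recommend against"
    return result

# Build some shared history, then let ann rate a new item that bob has not seen.
for item, ann_liked, bob_liked in [("a", True, True), ("b", True, True),
                                   ("c", False, False), ("d", True, False)]:
    rate(item, "ann", ann_liked)
    rate(item, "bob", bob_liked)
rate("e", "ann", True)
print(recommendations("e", "ann"))
```

In this run ann liked three shared items and bob liked two of them, so the proportion is 2/3, which clears bob's threshold of 0.6 and produces a recommendation.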
The other day I was talking with a friend, trying to gauge why he uses MySpace and why he's so attached to it. What is so desirable about MySpace? It looks horrible, is damn near impossible to customize, and is full of ads, so why are so many people enthralled with this service?
The answer turns out to have two parts. First is why they join: all of their friends are there too, so why not them? The second is why they stay: you try migrating your entire online presence away from MySpace; vendor lock-in is a bitch. This second point really is true of every hosted site service, so it's not unusual, but it's definitely important. To see why, we first have to look at the first point.
When you create your MySpace site, one of the biggest concerns is networking with your friends. You do this by finding their sites on MySpace and simply clicking a link to add them as a friend. But what if your friend is on LiveJournal, Blogger, or Vox? Sorry, only MySpace members can be your friend. The same is true of so many other services. So of course people are joining MySpace; you've got to in order to be able to network with your friends who are already on MySpace.
But this, as I said, is true of every hosted service, so why is MySpace itself so popular? Who knows. Perhaps it had to do with the extra content it provided early on. I don't really know. But someone had to come out on top and that was MySpace. The second point now becomes easier to understand. If all of your friends are on MySpace, and you can't network with outsiders, then no one's going to leave MySpace because that would amount to leaving their social network. Not only that, but you'd lose all of your customizations, etc. It's really impossible to make your online presence portable.
Essentially, the hassle of the migration process makes it not worth doing. This is true of so many things, including switching from PCs to Macs. But does it have to be this way? I don't think so. I doubt that many of the relevant companies would support the idea, but if there were an open API for certain things like site profiles, preferences, style/layout, blog content, and most importantly, social networks, then I think the migration process could be easier.
I'm sure that the companies would fear that it would reduce their member count, and maybe this is true, but I suspect that it could only hurt you if you're mistreating your customers. If you treat your customers like people, and provide them with the tools they need to do what they want with their online presence, then I think they'll be loyal to your service, even if those tools include utilities to make it easy to leave your service, or interact with other services without forcing people to join.
The key to this, as I said earlier, is the social networking element. Things like blog posts can be accessed via RSS of a sort, so it's not a huge issue to migrate them over. But the social network is difficult for a few reasons.
The obvious feature of social networking is being able to have a list of friends. This by itself isn't difficult; it's just a link to their site, basically. But along with that usually come things like displaying their profile picture and their name or nickname automatically. Getting this data would require a standard API for accessing their site and retrieving the name or nickname and profile picture. Putting this into an RSS feed located somewhere like "../PsygnisFive/profile/feed" would make the process immensely easier.
I imagine the process would go something like this: You click an add link which uses JS to pop up a little lightbox-like container with two fields: your service (Vox, in my case here), and your username on that service, or perhaps just the location of your personal site on the service. Then it would direct you to a RESTful URL, in my case perhaps "psygnisfive.vox.com/neighborhood/add". The GET or POST data would contain the URL of your friend's page, and that would be used to then get whatever info your service needs to add a person to your friends list.
Now that you've entered your service and your username, this gets stored in a cookie, so the next time you click the add button on someone's site, it's all done instantly. The service would then attempt to send a message back to the other user's page indicating a befriending, maybe by posting to something like "../messages/befriendedBy", and the procedure would be complete. I don't see there being any technical reason why this can't be done.
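As a sketch of the redirect step only, here is how the add link's target could be built with Python's standard urllib.parse, using GET for simplicity. The "/neighborhood/add" path comes from the example above; the "friend" parameter name, the https scheme, and the friend's address are hypothetical, since no such API exists yet.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def build_add_url(service_host: str, friend_url: str) -> str:
    """URL the add link would send a visitor to on their own service."""
    # The "friend" query parameter is an invented name for this sketch.
    query = urlencode({"friend": friend_url})
    return f"https://{service_host}/neighborhood/add?{query}"

# A visitor whose own site lives on Vox adds a friend hosted elsewhere.
add_url = build_add_url("psygnisfive.vox.com", "http://friend.example.com/")
print(add_url)

# The receiving service would read the friend's address back out of the query.
friend = parse_qs(urlsplit(add_url).query)["friend"][0]
print(friend)
```

The cookie and the message back to the friend's page would sit on either side of this step and are left out here.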
Some other issues to resolve would be the ability to make your site friends-only. Doing this might require an implementation of the OpenID system so that when you want to view your friend's site, you don't need an account on their service; they just need to have you as a friend, because your service acts as your ID server. Again, cookies could be used to make this process invisible after the first time. This might even promote something of a blending of your online profile, your OpenID, and other forms of online identification like vCards, into something that makes a lot more sense: a truly unified online identity.
But current systems don't support this stuff, so how can people migrate over currently? Well, it's not going to be easy at first; this is a given. The friends-only issue won't be solved until people support some standard. But adding friends can be done in a semi-manual sort of fashion by simply pointing your service to their site and having their data scraped. This would create at least a one-way network, connecting you to people on closed services. Getting the reverse to work will be something that the services must do themselves.