In Sanskrit, the word avatar refers to “an incarnation in human form.” On Roblox, few things reflect a user’s identity more directly than their avatar. As we will see, there is no “standard” Roblox user, and the incredible aesthetic diversity of our users’ avatars directly reflects the diversity of the user base itself.
Characterization of Avatars (Methodology)
If we are interested in aesthetic diversity, we need to start by describing the aesthetic of an avatar. The most natural place to look is the 2D avatar thumbnail that usually represents users to each other. For aesthetic analysis, we need to turn this thumbnail into a semantically meaningful numeric representation. There are many ways to reduce its dimensionality, and here are a few that we can try.
1. The simplest approach: apply PCA directly to the flattened thumbnail image. To evaluate the “quality” of the reduction, we visualize the thumbnails at the poles of the principal components (PCs). We can see that while the first PC distinguishes between plausible avatar types, the twelfth is too broad to be meaningful.
PC 1 (14.3% of explained variance):
PC 12 (1.5% of explained variance):
2. Almost as simple: we can take the last hidden layer of an off-the-shelf image classification network (ResNet-18) as an embedding and evaluate its quality by clustering the embeddings. Notice how ResNet captures color information very well (see all the blue shoes in the second cluster) but sometimes fails to encode shape information (see the first cluster).
Thumbnail samples from two clusters are shown below:
3. To get a visual sense of cohesion, we can apply UMAP to reduce the image classification embeddings to 2 dimensions. Although there do appear to be clearly discernible clusters, the large blob on the lower right looks suspicious. Sure enough, samples from that megacluster are not visually cohesive.
2D embedded plot:
Samples from megacluster in 2D embedded space:
4. Train a small custom variational autoencoder (VAE) directly on the thumbnail data. Ideally, this captures the aesthetic variation unique to Roblox avatars better than a general-purpose image classifier. (Cute aside: K-means is particularly well suited to clustering these embeddings, since its Gaussian prior matches that of the VAE’s latent variables.)
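The post does not spell out the VAE's training objective. As a point of reference for the aside in item 4, and assuming the standard formulation (not confirmed by the source), a VAE maximizes the evidence lower bound for each thumbnail x with a standard normal prior on the latent vector z:

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big),
\qquad p(z) = \mathcal{N}(0, I)
```

Under this assumption, the KL term pulls the latent codes toward an isotropic Gaussian, and K-means can be read as a hard-assignment fit of a mixture of equal-variance isotropic Gaussians, which is one way to interpret why the two pair well.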
While some metrics attempt to quantify the merits of each approach, real-world uses of unsupervised learning often come down to subjective assessment. Anecdotally, we have seen the most success with approach 4.
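To make the first approach in the list above concrete, here is a minimal sketch of PCA on flattened thumbnails, computed with an SVD. The function name `pca_poles`, the image size, and the random stand-in data are all assumptions for illustration; the post does not show its actual pipeline.

```python
import numpy as np

def pca_poles(images, component, n_poles=5):
    """Return (explained variance ratio, low-pole indices, high-pole indices)
    for one principal component of a stack of images."""
    # Flatten each thumbnail into one row vector and center the columns.
    X = images.reshape(len(images), -1).astype(np.float64)
    X -= X.mean(axis=0)
    # Rows of Vt are the principal directions; S holds the singular values.
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    ratio = S**2 / np.sum(S**2)
    # Project every thumbnail onto the chosen component and sort by score.
    scores = X @ Vt[component]
    order = np.argsort(scores)
    return ratio[component], order[:n_poles], order[-n_poles:]

# Synthetic stand-in for real thumbnails: 200 random 16x16 RGB images.
rng = np.random.default_rng(0)
thumbnails = rng.random((200, 16, 16, 3))

ratio, low, high = pca_poles(thumbnails, component=0)
print(f"PC 1 explains {ratio:.1%} of the variance")
print("negative pole:", low, "positive pole:", high)
```

With real thumbnails, the images at `low` and `high` are the ones you would display side by side to judge whether a component separates recognizable avatar types.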
The Avatar Manifold
Using the VAE, we can transform thumbnails into compact 64-dimensional vectors for clustering. Here are some example VAE + K-means clusters from a run with 20 clusters:
Some highly customized avatars in one cluster:
The tall, thin avatars that we call “Rthro” in another cluster:
The large, cube-shaped avatars that we call “Blocky” in this cluster:
The default avatar in this one:
Lightly customized avatars between the Rthro and Blocky body styles in this one:
Roblox’s Dark Angel
“Look over there!”
“I believe I can fly”
The consistency of these clusters across multiple runs, random initializations, and choices of k suggests that avatars naturally fall into distinct (albeit fuzzy) categories. At the extreme ends of the manifold, we have the classic “Blocky” characters with square bodies opposite the tall, skinny, more lifelike “Rthro” avatars. We also found the default avatars of users who haven’t edited them since joining Roblox (cluster 4 above). In between, there’s everything from “ninja goth” to “going to the club.”
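One way to check this kind of consistency is to cluster the same embeddings with several random initializations and compare the resulting partitions. The sketch below uses a plain Lloyd's-algorithm K-means and the adjusted Rand index as the agreement score. Both are assumptions chosen for illustration, as are the synthetic 64-dimensional data and the value k = 3; the post does not say how consistency was measured.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center (squared Euclidean distance) for each row."""
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, seed, n_iter=50):
    """Plain Lloyd's algorithm with random initialization; one label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # fancy indexing copies
    for _ in range(n_iter):
        labels = assign(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's center where it is
                centers[j] = members.mean(axis=0)
    return assign(X, centers)

def adjusted_rand_index(a, b):
    """Chance-corrected agreement between two labelings (1.0 = same partition).
    Undefined when both labelings put every point in a single cluster."""
    def pairs(m):
        return m * (m - 1) / 2.0
    # Contingency table: points in cluster i of `a` and cluster j of `b`.
    table = np.zeros((a.max() + 1, b.max() + 1), dtype=np.int64)
    np.add.at(table, (a, b), 1)
    same_both = pairs(table).sum()
    same_a = pairs(table.sum(axis=1)).sum()
    same_b = pairs(table.sum(axis=0)).sum()
    expected = same_a * same_b / pairs(len(a))
    return (same_both - expected) / (0.5 * (same_a + same_b) - expected)

# Synthetic stand-in for VAE embeddings: three Gaussian blobs in 64 dimensions.
rng = np.random.default_rng(0)
blob_centers = rng.normal(scale=10.0, size=(3, 64))
embeddings = np.concatenate([c + rng.normal(size=(100, 64)) for c in blob_centers])

runs = [kmeans(embeddings, k=3, seed=s) for s in range(4)]
scores = [adjusted_rand_index(runs[0], r) for r in runs[1:]]
print("agreement with the first run:", scores)
```

Scores near 1 across seeds and across nearby values of k would support the reading that the categories are real structure in the data; scores that swing widely would suggest the clusters depend on the initialization.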
Identity Through Avatar
How do these aesthetic clusters relate to our own users?
The easiest place to start is user behavior on the platform. We charted four engagement metrics by cluster: avatar edits in the last month, account age in weeks, total seconds of playtime, and one-month retention. The four resulting graphs show dramatic variation between clusters. Users with heavily customized avatars tend to engage the most and are retained most often, while users whose avatars are not heavily customized tend to be less engaged.
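The per-cluster comparison described above amounts to a group-by over user records. The sketch below shows the shape of that computation on a handful of made-up rows. The records, field order, and metric names are hypothetical and are not Roblox data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-user records (illustrative values only):
# (cluster id, avatar edits last month, account age in weeks,
#  total playtime in seconds, retained after one month)
users = [
    (0, 12, 80, 360_000, True),
    (0, 9, 64, 250_000, True),
    (1, 0, 3, 4_000, False),
    (1, 1, 5, 9_000, True),
    (1, 0, 2, 1_500, False),
]

# Group the metric columns by cluster id.
by_cluster = defaultdict(list)
for cluster, *metrics in users:
    by_cluster[cluster].append(metrics)

# Average each metric within a cluster; retention becomes a fraction.
summary = {}
for cluster, rows in sorted(by_cluster.items()):
    edits, age, playtime, retained = zip(*rows)
    summary[cluster] = {
        "mean_edits": mean(edits),
        "mean_account_age_weeks": mean(age),
        "mean_playtime_seconds": mean(playtime),
        "retention_rate": sum(retained) / len(retained),
    }

for cluster, stats in summary.items():
    print(cluster, stats)
```

Each key of `summary` corresponds to one bar in each of the four charts, which is what makes differences between clusters easy to compare at a glance.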
There are two competing causal explanations for this. One is that users who edit their avatars become more engaged with Roblox as a result. The other is that users who are already invested in Roblox tend to put more effort into their avatars as time goes on. Others at Roblox are doing great work to determine which interpretation to trust.
Regardless of causality, we find that the two aspects of platform identity – aesthetic representation and degree of interaction – are inextricably linked. What about off-platform identities? How do users’ real-life identifiers – age, geography, gender, etc. – match their Roblox identities? Check out Part 2 of this blog post to find out!
Nameer Hirschkind is a Data Science Intern at Roblox. He works on Roblox Avatars to help every player create an Avatar they love. Neither Roblox Corporation nor this blog endorses or supports any company or service. In addition, no guarantees or promises are made as to the accuracy, reliability or completeness of the information contained in this blog.
© 2021 Roblox Corporation. Roblox, the Roblox logo, and Powering Imagination are among our registered and unregistered trademarks in the United States and other countries.