Data Interpretation

What can I expect from the social media data analysis? How can I interact with the results? How can I store them for later? And most importantly: what do I learn?

Our social media data science tools compute and derive various parameters, statistics and metrics. All data are visualised in several graphs and tables using the Python-based library Bokeh. Currently, three graph types are available: General Statistics, Pair-Plots and Networks.
All graphs share similar interactive elements on the right side. The top icon links directly to the Bokeh library. Below it you find tools for zooming in and out, panning the graph, resetting all settings, and saving the final figure for your documents or presentations. By default, the pan, tap and hover tools are active, so you can pan the figure, select items, or hover over parts of a graph to get more information. Several tables add further information and summarise the data. Click on a table header to sort the values by that column. A download button lets you save the data for later offline use.
In the following, each element is explained in more detail. Click on the element you are interested in. The data shown are based on an analysis of around 500 Instagram posts uploaded in Stuttgart (location ID: 213128338).


General Statistics

The General Statistics section provides an overview of the hashtags used and of the distribution of the number of hashtags per post. If you are looking for a quick and simple analysis of the hashtag content, these graphs and tables are your first choice. Use the popular hashtags for your own posts and check how many hashtags are currently used. Some communities rely on only a few hashtags, and overusing them comes across as "click-baiting" or unwanted "mass posts".

This graph shows the top 10 hashtags in a pie chart. Hover over the elements to get more information, namely the hashtag name, the total number of counts and the overall usage percentage within the top 10. The colors correspond to the legend, and by clicking on a name in the legend you can mute the corresponding color.

This table shows the top 30 hashtags with the corresponding number of counts.
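
The counts and percentages behind such a chart and table can be sketched in a few lines of Python. This is only an illustration and not the code used by our tools: the sample posts and the function name top_hashtags are made up, and it assumes that every occurrence of a hashtag counts once.

```python
from collections import Counter

# Hypothetical sample data: each post is a list of its hashtags.
posts = [
    ["stuttgart", "0711", "germany"],
    ["stuttgart", "0711"],
    ["stuttgart", "fashion"],
    ["fitness"],
]

def top_hashtags(posts, n=10):
    """Return (hashtag, count, share in percent within the top n) rows."""
    # Assumption: every occurrence of a hashtag counts once.
    counts = Counter(tag for post in posts for tag in post)
    top = counts.most_common(n)
    total = sum(count for _, count in top)
    return [(tag, count, 100.0 * count / total) for tag, count in top]

for tag, count, share in top_hashtags(posts, n=2):
    print(f"#{tag}: {count} counts, {share:.1f} % of the top 2")
```

With n=10 the rows correspond to the pie chart, with n=30 to the table.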

This distribution shows how many hashtags have been used per post. At least 1 hashtag is necessary and the maximum is 30. The counts are shown on the vertical axis, and the exact number for each histogram bar can be obtained by hovering over it. Click on a bar to mute all other bars. Hold shift and click on several bars to emphasise more than one bar.

This table summarises the histogram graph content.
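
As a rough sketch of how such a histogram can be derived, the following Python snippet counts how many posts use 1, 2, ... 30 hashtags. The sample posts and the function name are hypothetical, and posts without any hashtag are simply ignored here.

```python
from collections import Counter

# Hypothetical sample data: each post is a list of its hashtags.
posts = [
    ["stuttgart"],
    ["stuttgart", "0711"],
    ["fashion", "fitness"],
    [],  # a post without hashtags is skipped
]

def hashtag_count_distribution(posts, max_tags=30):
    """Map each possible number of hashtags (1..max_tags) to a post count."""
    sizes = Counter(len(post) for post in posts if post)
    # Fill in zeros so that every histogram bar has a value.
    return {k: sizes.get(k, 0) for k in range(1, max_tags + 1)}

distribution = hashtag_count_distribution(posts)
print(distribution[1], distribution[2], distribution[3])
```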



Pair-Plots

The Pair-Plots are matrix-like graphs that support you in finding hashtag pairs. Hashtags are rarely isolated features in social media (more on this later in the section Networks). The graphs here help you to find hashtag pairs that are often posted together.

This graph shows the top 30 hashtags in a matrix-like pair graph. Hover with your mouse over a square to obtain more information: the names of the hashtag pair, which are listed on the vertical and horizontal axes, and the number of counts. To provide an easy visual overview of these counts, we colored the squares accordingly. Red squares are the top pairs. In this case the top pair is #0711 and #Stuttgart with 67 counts. The median over all pairs is 4 and is colored white. Values below the median are colored blue and mark pairs that are rarely used together.
Click on the squares to mute all other squares and hold the shift button to mark more squares. If you want to restore the original view, click on the reset button.

This table provides the counts of the 435 visualised pairs. The first and second column provide the hashtag names and the last column lists the counts.
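
The 435 pairs follow directly from the top 30 hashtags: 30 hashtags can be combined into 30 * 29 / 2 = 435 unordered pairs. The sketch below shows one way to count pairs and to assign a color relative to the median. It is an illustration only: the sample posts and function names are made up, and the three fixed colors simplify the color scale of the actual graph.

```python
from collections import Counter
from itertools import combinations
from math import comb
from statistics import median

# Hypothetical sample data: each post is a list of its hashtags.
posts = [
    ["stuttgart", "0711", "germany"],
    ["stuttgart", "0711"],
    ["0711", "stuttgart", "fashion"],
]

def pair_counts(posts):
    """Count how often each unordered hashtag pair appears in the same post."""
    counts = Counter()
    for post in posts:
        # sorted(set(...)) gives every pair one canonical key per post.
        for pair in combinations(sorted(set(post)), 2):
            counts[pair] += 1
    return counts

def pair_color(count, med):
    """Simplified color code: red above, blue below, white at the median."""
    if count > med:
        return "red"
    if count < med:
        return "blue"
    return "white"

counts = pair_counts(posts)
med = median(counts.values())
for pair, count in counts.most_common():
    print(pair, count, pair_color(count, med))

print(comb(30, 2))  # number of unordered pairs among 30 hashtags
```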

This Pair-Plot lists the hashtags on the horizontal axis and the average number of hashtags used in a post on the vertical axis. The color code is the same as described before. As you can see, in Stuttgart, #Stuttgart, #0711 and #Germany are frequently used in posts with a varying number of hashtags. Hashtags that are far more common across several communities, like #happy, #photooftheday or #fitness, are used mostly in "mass posts" with 30 hashtags that try to reach everyone. One option is to use theme-relevant hashtags in moderation and to add some of these "mass hashtags" afterwards.
Click on the squares to mute all other squares and hold the shift button to mark more squares. If you want to restore the original view, click on the reset button.

950 combinations can be seen in the previous matrix graph. The pairs and corresponding counts are listed in this table. The first two columns indicate the hashtag pair names, and the last column provides the number of counts.



Networks

Network graphs are one of the most common visualisation techniques in social media and other group-related subjects. The information shown in the sections General Statistics and Pair-Plots can be derived from these networks to focus on certain aspects. We provide a detailed graph analysis section that lets you understand the relations between hashtags in the big picture. The following terms are used: node - in this case a hashtag, represented by a dot; edge - the connection between two nodes.

The tab Network provides an overview of the top 50 hashtags and their relations to each other. Each dot represents a hashtag (the name is shown above each dot). The size of a dot scales with how often this particular hashtag has been used. It is an additional visual representation of the red-white-blue colors shown before, which are also used in this plot. As a reminder: white - average (median) number of mentions; blue - less than average number of mentions; red - more than average number of mentions. As we have already seen, Instagram posts uploaded in Stuttgart mostly use the hashtag #Stuttgart. This hashtag can be seen in the center of the network. It is the dominant tag of the network. Hovering over a hashtag shows its total number of counts and its number of connections to other hashtags. Connections between the hashtags are visualised with solid black lines. Thick, opaque lines correspond to a large number of connections and thin, transparent lines to a small number. The number of connections can be found in the Table (Summary). Clicking on a hashtag dot mutes all unrelated connecting lines. Holding the shift button allows you to click on several hashtags and reveal more structures.
This network helps you to find groups and sub-groups and supports you in identifying less relevant hashtags that are "far away" from the center. For example, have a look at (and click on) #fashion, and you will see that the Stuttgart Instagramers also use #fitness, #oots (outfit of the day) and #fashionblogger next to this tag.

The Table (Summary) lists the top 100 connections of the network. The first and second columns list the hashtag pair names. The third column shows the number of connections between these pairs and the last two columns show additionally how often the particular hashtags have been used.
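
The rows of such a summary table can be sketched as follows. The snippet is a simplified illustration with made-up sample posts and a hypothetical function name. It assumes that a connection is counted once for every post in which both hashtags appear.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each post is a list of its hashtags.
posts = [
    ["stuttgart", "0711", "germany"],
    ["stuttgart", "0711"],
    ["0711", "stuttgart", "fashion"],
]

def summary_table(posts, top=100):
    """Rows of (hashtag 1, hashtag 2, connections, count 1, count 2)."""
    # Node counts: in how many posts each hashtag appears.
    node_counts = Counter(tag for post in posts for tag in set(post))
    # Edge counts: in how many posts both hashtags of a pair appear.
    edge_counts = Counter(
        pair
        for post in posts
        for pair in combinations(sorted(set(post)), 2)
    )
    return [
        (a, b, n, node_counts[a], node_counts[b])
        for (a, b), n in edge_counts.most_common(top)
    ]

for row in summary_table(posts):
    print(row)
```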

Graph theory, the mathematical background of the networks shown, offers many tools to analyse groups, individual nodes, the connecting lines, etc. This table is useful to understand the networks more quantitatively. However, you can skip it if you are not interested in the mathematical details.
Degree Centrality
The degree centrality indicates, in a range between 0 and 1, how many links (edges) a node has to other nodes in a network. A value of 1 means that the node is connected to all other nodes, and 0 indicates an isolated node, in our case a hashtag that is used with no other tags in a post. As you can see, #Stuttgart and #0711 are the main nodes in Stuttgart, while #stuttgartgram appears less connected.
Closeness Centrality
The closeness centrality indicates, in a range between 0 and 1, the reciprocal of the sum of the shortest path lengths from one node to all other nodes. Put more simply: the smaller the value, the less closely the node is related to the other nodes. Take #stuttgartliebe as an example. This hashtag has a relatively small closeness centrality. In the provided network graph it appears far away from the center, creates no center of its "own" and requires additional hashtags to connect with, for example, #fitness or #summer. In this case #stuttgartliebe is a "supportive" hashtag, e.g. for #0711 or #fashion. It should not be your main hashtag, but it can underline your Stuttgart-related post.
Betweenness Centrality
How relevant is a hashtag? Can I remove a particular hashtag, or leave it out in the first place? These questions can be answered with the betweenness centrality. Qualitatively speaking: the higher the betweenness, the more the node "bonds" the network. As we have already seen, #Stuttgart and #0711 are the main hashtags in Stuttgart. Not using these tags when posting something in Stuttgart may lead to your posts being overlooked entirely. Removing these hashtags is far more critical than removing e.g. #deutschland or #stuttgartliebe.
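
For readers who want to see the three centralities at work, here is a minimal sketch in plain Python. It is not the implementation behind the table. The small network is made up, the graph is assumed to be undirected, unweighted and connected, and the normalisations are common textbook conventions that may differ from the ones used for the table.

```python
from collections import deque

# Hypothetical undirected hashtag network: node -> set of neighbours.
graph = {
    "stuttgart": {"0711", "germany", "fashion"},
    "0711": {"stuttgart", "germany"},
    "germany": {"stuttgart", "0711"},
    "fashion": {"stuttgart", "fitness"},
    "fitness": {"fashion"},
}

def degree_centrality(graph):
    """Number of neighbours divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def bfs_distances(graph, source):
    """Shortest path lengths (in edges) from source to all reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness_centrality(graph):
    """(n - 1) divided by the sum of shortest path lengths (connected graph)."""
    n = len(graph)
    result = {}
    for v in graph:
        total = sum(bfs_distances(graph, v).values())
        result[v] = (n - 1) / total if total else 0.0
    return result

def betweenness_centrality(graph):
    """Brandes' algorithm, normalised to the range 0..1 (undirected graph)."""
    n = len(graph)
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        order = []                        # nodes in order of increasing distance
        preds = {v: [] for v in graph}    # predecessors on shortest paths
        sigma = dict.fromkeys(graph, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies, farthest nodes first.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every pair is visited from both ends, so the raw score is twice the
    # pair count; dividing by (n - 1)(n - 2) normalises to the range 0..1.
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: b * scale for v, b in score.items()}

for name, values in [
    ("degree", degree_centrality(graph)),
    ("closeness", closeness_centrality(graph)),
    ("betweenness", betweenness_centrality(graph)),
]:
    print(name, {tag: round(value, 2) for tag, value in values.items()})
```

In this toy network "stuttgart" has the highest value in all three measures, while "fitness" sits at the edge of the network and has a betweenness of 0.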

Which hashtags are strongly connected with each other? Which of them form a "clique", a group whose members are all connected with each other? If you want the top 5 groups of hashtags (here: in Stuttgart), this table helps you to choose a group, or at least a few members of a strong group. The first column indicates the hashtag. Columns 2 to 6 indicate the top 5 groups (in descending order) and show whether a hashtag is part of the group (yes) or not (no). Click on the table headers to sort by the requested group.
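
A clique table of this kind can be sketched with the classic Bron-Kerbosch algorithm, which lists all maximal cliques of a network. The snippet below is an illustration only. The network and the function names are made up, and ranking the groups by their size is an assumption, since the ranking criterion of the table is not stated here.

```python
# Hypothetical undirected hashtag network: node -> set of neighbours.
graph = {
    "stuttgart": {"0711", "germany", "fashion"},
    "0711": {"stuttgart", "germany"},
    "germany": {"stuttgart", "0711"},
    "fashion": {"stuttgart", "fitness"},
    "fitness": {"fashion"},
}

def maximal_cliques(graph):
    """List all maximal cliques (basic Bron-Kerbosch without pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)  # nothing can extend the clique further
            return
        for v in list(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return cliques

def top_groups(graph, top=5):
    """Assumption: groups are ranked by size, ties broken alphabetically."""
    ranked = sorted(maximal_cliques(graph), key=lambda c: (-len(c), sorted(c)))
    return ranked[:top]

def membership_table(graph, top=5):
    """For each hashtag, a yes/no entry per top group."""
    groups = top_groups(graph, top)
    return {
        tag: ["yes" if tag in group else "no" for group in groups]
        for tag in graph
    }

for tag, row in membership_table(graph).items():
    print(tag, row)
```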

