Blog Map

By: Kathy Gerber
Published On: 12/17/2006 5:16:45 PM




Imagine a sketch of a social network, for example, an extended family.  It isn't difficult to extend that imagery of relationships to the various progressive blogs in Virginia, and the size of the population is about the same. I have no plans to make this a high priority, but I decided to go ahead and write it up in a diary for general interest.

Just to work through the process, I wrote down the first 9 blogs that came to mind.  Then, without gathering any new information, I scored each pair of blogs on a 1-9 scale for my subjective perception of their relationship, though in practice I used only 2 through 7.  Little or no information was a 5, very strong collaboration a 2, and a recurring history of conflict beyond debating the issues earned a 7.
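For anyone who wants to play along, here is a minimal sketch of how the scores could be stored. The blog names and the two non-neutral scores are made-up placeholders, not the numbers I actually used, and `build_scores` is just a name invented for this example.

```python
from itertools import combinations

NEUTRAL = 5  # "little or no information"

def build_scores(blogs, known):
    """Return a symmetric table of pairwise scores.

    `known` maps frozenset({a, b}) to a score on the 1-9 scale;
    any pair not listed defaults to NEUTRAL.
    """
    scores = {a: {b: 0 for b in blogs} for a in blogs}
    for a, b in combinations(blogs, 2):
        s = known.get(frozenset((a, b)), NEUTRAL)
        if not 1 <= s <= 9:
            raise ValueError(f"score out of range for {a}/{b}: {s}")
        scores[a][b] = scores[b][a] = s
    return scores

# Hypothetical placeholders for the 9 blogs.
blogs = [f"blog{i}" for i in range(1, 10)]
known = {
    frozenset(("blog5", "blog8")): 2,  # strong collaboration (illustrative)
    frozenset(("blog3", "blog7")): 7,  # recurring conflict (illustrative)
}
scores = build_scores(blogs, known)
print(scores["blog5"]["blog8"], scores["blog1"]["blog2"])
```

With 9 blogs there are 36 pairs to score, which is why a quick default for "don't know" matters.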

Not only does the fact that I am "centered" blogwise at RK bias my perception, but it also skews the subset of inter-blog interaction that is likely to come my way.  Obviously, it follows that it would be a good idea to develop a more objective but rapid measure for determining scores.

The usual blog metrics like hits and user participation would probably have very little effect on the scores, even with a bigger data set.  If a blog is new or very quiet, then it is more likely to receive scores near 5.  Another factor that is neutralized is content quality. In fact, strong contemplative writers may be penalized in this system for focusing on their work.

As a network, the data were represented by the figure pictured above.  All nodes (blogs) were equally spaced as the vertices of a regular 9-gon, so the illustration gives no useful information. That occurred because the system is over-determined and inconsistent in Euclidean space: subjective scores do not have to obey the triangle inequality.  For example, if d(A,B)=1, d(B,C)=7, and d(A,C)=2, then B and C could be at most 1+2=3 apart, so it is impossible to sketch in the usual way.

If that sounds nerdy, all it means is that it is not possible to stand next to someone who is holding hands with a third person and be really, really far away from the third person at the same time.
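The hand-holding problem can be checked mechanically. This is a small sketch, using the A, B, C example above, that lists every triple of nodes whose longest "side" is longer than the other two combined. The function name is made up for illustration.

```python
from itertools import combinations

def triangle_violations(nodes, d):
    """Return the triples whose scores cannot be drawn as a triangle.

    `d` maps frozenset({x, y}) to the score for that pair.
    """
    bad = []
    for a, b, c in combinations(nodes, 3):
        sides = sorted([
            d[frozenset((a, b))],
            d[frozenset((b, c))],
            d[frozenset((a, c))],
        ])
        # The longest side must not exceed the sum of the other two.
        if sides[2] > sides[0] + sides[1]:
            bad.append((a, b, c))
    return bad

d = {frozenset("AB"): 1, frozenset("BC"): 7, frozenset("AC"): 2}
print(triangle_violations("ABC", d))  # [('A', 'B', 'C')]
```

On a 2-to-7 scale this happens easily: any blog that collaborates closely with two blogs that feud with each other produces a violation.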

So using the same numbers, I tried clustering, and that worked much better.  The name "agnes" (agglomerative nesting) reflects the fact that this is an agglomerative technique: nodes that are "close" get lumped together, and the process iterates.
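To show what "lumped together, and the process iterates" means, here is a bare-bones sketch of agglomerative clustering in plain Python. It is not the agnes code itself; it assumes average linkage (the distance between two groups is the average score across all cross-group pairs), and the four nodes and their scores are invented for the example.

```python
from itertools import combinations

def agglomerate(nodes, d):
    """Average-linkage agglomerative clustering.

    `d` maps frozenset({x, y}) to a score. Returns the merges in the
    order they happen, as (group_a, group_b, height) tuples.
    """
    clusters = [frozenset([n]) for n in nodes]
    merges = []

    def linkage(x, y):
        # Average score over every pair with one member in each group.
        total = sum(d[frozenset((a, b))] for a in x for b in y)
        return total / (len(x) * len(y))

    while len(clusters) > 1:
        # Find the two closest groups and merge them.
        x, y = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((sorted(x), sorted(y), linkage(x, y)))
        clusters = [c for c in clusters if c not in (x, y)] + [x | y]
    return merges

# Invented scores for four hypothetical blogs.
nodes = ["A", "B", "C", "D"]
d = {
    frozenset("AB"): 2, frozenset("CD"): 3,
    frozenset("AC"): 6, frozenset("AD"): 7,
    frozenset("BC"): 5, frozenset("BD"): 6,
}
for left, right, height in agglomerate(nodes, d):
    print(left, "+", right, "at", height)
```

The merge heights are what a dendrogram draws: pairs that join low are close friends, and a node that only joins near the top is isolated.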

Here 5 and 8 are shown to be close, and somewhat removed from everyone else.  3, 4 and 7 are a little distant. What would emerge from including more blogs would be a better sense of whether or not a particular blog is isolated.  Richer "friend" groups would be likely to emerge as well.

The cluster plot below is also fairly informative.  The underlying algorithm is different as it attempts to divide the 9 nodes into two groups based on the scores.
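One common way to split nodes into two groups from scores alone is to pick two "medoids" (representative nodes) and assign everyone to the nearer one. The sketch below assumes that approach and simply tries every pair of medoids, which is fine for a handful of blogs. The five nodes and their scores are invented for illustration.

```python
from itertools import combinations

def two_medoids(nodes, d):
    """Split `nodes` into two groups around the best pair of medoids.

    `d` maps frozenset({x, y}) to a score. Every pair of candidate
    medoids is tried; the pair with the lowest total score from each
    node to its nearer medoid wins.
    """
    def dist(a, b):
        return 0 if a == b else d[frozenset((a, b))]

    best = None
    for m1, m2 in combinations(nodes, 2):
        cost = sum(min(dist(n, m1), dist(n, m2)) for n in nodes)
        if best is None or cost < best[0]:
            best = (cost, m1, m2)

    _, m1, m2 = best
    groups = {m1: [], m2: []}
    for n in nodes:
        groups[m1 if dist(n, m1) <= dist(n, m2) else m2].append(n)
    return groups

# Invented scores for five hypothetical blogs.
nodes = ["A", "B", "C", "D", "E"]
d = {
    frozenset("AB"): 2, frozenset("AC"): 6, frozenset("AD"): 7,
    frozenset("AE"): 6, frozenset("BC"): 5, frozenset("BD"): 6,
    frozenset("BE"): 6, frozenset("CD"): 3, frozenset("CE"): 2,
    frozenset("DE"): 3,
}
groups = two_medoids(nodes, d)
print(sorted(groups.values()))  # [['A', 'B'], ['C', 'D', 'E']]
```

Unlike the agglomerative approach, this one is told the number of groups up front, so it will always find two camps even if the blogs are really one big family.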

This is a snapshot in time, and it's reasonable to anticipate that there would be more activity near the intersection during campaign season, reflective of a higher degree of collaboration.

The point of this exercise is not only to generate a crude blog map, but also to determine a reasonably rapid approach for assessing some attributes that may not be so very obvious, e.g., inclination to network or isolation over time.

Extending this model to encompass Virginia political blogs in general, without regard to party, would not require much work at all; it only requires assessing the relationships.  And it would certainly make the map more interesting. The scale is very comfortable. An expansion to something like the leftyblog universe, by contrast, would be a little messy and lots of work.

But the really interesting work would bring in the time dimension and perhaps other variables.  Imagine a graph like the one to the right made into a movie with frames at evenly spaced time intervals. Event-driven discontinuities would probably be visible.  The macaca incident would sharpen the divide between party affiliates, and there would be certain dispersion responses within groups.
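As a rough sketch of how the "movie" idea might be checked numerically, one could score the same pairs at each time step and look for a frame where the scores jump far more than usual. Everything here is hypothetical: the frames, the scores, and the rule of thumb (a change more than three times the median change) are invented for illustration.

```python
from statistics import median

def frame_changes(frames):
    """Total absolute score change between consecutive snapshots."""
    return [
        sum(abs(cur[pair] - prev[pair]) for pair in prev)
        for prev, cur in zip(frames, frames[1:])
    ]

def discontinuities(changes, factor=3.0):
    """Indices of transitions that changed more than factor x the median."""
    typical = median(changes)
    return [i for i, c in enumerate(changes) if c > factor * typical]

# Five invented snapshots of three pairwise scores.
frames = [
    {"AB": 4, "AC": 5, "BC": 5},
    {"AB": 4, "AC": 5, "BC": 4},
    {"AB": 3, "AC": 5, "BC": 4},
    {"AB": 2, "AC": 7, "BC": 7},  # an event sharpens the divide
    {"AB": 2, "AC": 7, "BC": 6},
]
changes = frame_changes(frames)
print(changes)                   # [1, 1, 6, 1]
print(discontinuities(changes))  # [2]
```

Index 2 is the step from the third frame to the fourth, which is where the invented event lands.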

This kind of modeling may appear to run counter to several existing views.  For example, a partisan scoop in this scenario is of little interest.  What is of greater interest, however, is a better understanding of response to events in the aggregate.


Comments



Wow. (Neal2028 - 12/17/2006 7:22:37 PM)
This all made my head hurt, but it is very interesting.


Sorry (Kathy Gerber - 12/17/2006 10:31:27 PM)
about the headache part. I really want to change some things and coerce the network idea. Whatever, the end results should be fairly intuitive.  There are so many caveats and conditions though, and now they are out of the way.

Thanks for the feedback -



head hurt (drmontoya - 12/17/2006 11:23:00 PM)
me too!


Interesting analysis (JPTERP - 12/17/2006 11:50:41 PM)
What is the endgame of the analysis? 

Is this a tool for quantifying biases relative to the larger universe of political blogs?

Or is this a way to measure the impact of events on a blog's readership?  (e.g. does an incident like the macaca incident  increase participation on a blog, but result in a blog becoming more partisan?)

Not sure that I understand the bigger picture at work here.  However, seems like there's quite a bit of food for thought. 



There are many possibilities (Kathy Gerber - 12/18/2006 1:40:25 AM)
it's more about getting a better understanding of structures and relationships beyond folklore or an idiosyncratic take on institutional knowledge.  There may be surprises.

There are roughly two ways of improving predictive successes  -  guessing with a little more information or guessing a little more wisely.

So I'm assuming incomplete information, but "enough" information for better informed decision making.  Yes, there's value in networking, insider info, living on IM, on and on. There's also value in having a handle on the natural dynamics inherent in the society within which all of that occurs.

I guess that's still vague.



So Kathy, what was your Doctorial Thesis on? (Used2Bneutral - 12/18/2006 5:55:32 AM)
The last time I saw a dission like this it involved a bottle of good scotch and Isaac Asimov at a Mensa gathering. It was based on the effects of the telephone and the prospect of some newfangled device back then called the "Cell Phone"..... some of my earliest cell phone bills were well over $1500/month in 1984 dollars.....


spelling..... discussion (Used2Bneutral - 12/18/2006 5:56:23 AM)


That's unreal. (Kathy Gerber - 12/18/2006 8:33:07 PM)
$1500 is an outrageous amount for a phone bill.  Who in the world were you talking with??

I never did finish because of a family tragedy.  But honestly, it surprises me to see a few people who did finish yet seem to be chronically bitter and miserable.

We got our first cell phones this year :)



Not quite clear (JPTERP - 12/18/2006 7:41:41 AM)
Still sounds vague. 

It sounds like you're trying to discover a method for making more informed decisions by filtering out biases found within a network or community?  (e.g. a method which could be applied to any network or community--not just a political one).

If so, it's a tough, if not impossible, assignment. 

I think the awareness that you're assuming "incomplete information" is a good starting point.  I think the premise though that there is such a thing as an "optimal" choice is pretty slippery.  We are always making the best possible choice given the information that we have.  Sometimes we choose poorly or rightly, and that informs future decisions.  Sometimes a good choice in the short-run ends up being a bad choice in the long run.

If you define optimal choice in a very limited, quantifiable sense, you can do it.  If you're talking about what stock is going to perform best over the next decade, you could set up a model for this which would inform your decision about where to invest your money.

But if you're talking about which politician will serve the country best over the next 4 years, you're necessarily doing a lot of guess work based on lived experience.  The best choice comes from having an open-mind, and exchanging ideas openly.  Of course "the best" is entirely subjective.  For some G.W. Bush is "the best" because he talks about morality a lot.  For others, they would trade a moralizing president any day for one who makes wise economic and foreign policy decisions that benefit themselves, their friends, and their neighbors.  When you're talking about an "optimal" choice here the standard is entirely subjective.

At least that's where I'm guessing you're headed with this.  I could be completely off the mark. 



Suppose you already made your choice. (Kathy Gerber - 12/18/2006 9:46:18 PM)
Now you're faced with the task of

supporting a candidate = influencing the outcome



Here's another stab (JPTERP - 12/18/2006 8:09:18 AM)
So the goal of the exercise is to create a high quality environment (a collaborative one) with a great diversity of opinion through social network mapping?

I guess two questions would be:
1. How would you define collaboration?  Does this have to do with frequency of use, level of involvement, or some other description?
2. A recurring history of conflict beyond debating the issues, at least from my subjective point of view, would be best defined by NLS.  I enjoy Ben's blog--I also think there's probably a correlation between the wide range of viewpoints Ben is able to attract to his site, the scandal-heavy format (occasionally interrupted by statistical analysis), and the tendency of the discussions to disintegrate into ad hominem attacks.

RK and NLS serve different purposes.  Occasionally the two purposes intersect.

I could be completely off-the-mark here.  It's an interesting topic though.  I'm still trying to figure out how to apply the broad principles to specific examples.

 



Collaboration (Kathy Gerber - 12/18/2006 10:12:26 PM)
An example would be giving credit or cross-referencing positively. Vivian Paige has written about blacknell.net, which I for one bothered to read because of it.

Or JC encouraging pj to start her own blog.  Things like that get a 2.  They both sometimes post on RK, and I would consider them friends of RK, etc. 

Yes, NLS and RK are qualitatively different. I would have to guess that NLS would be near the intersection of bipartisan grouping. If Ben posts negatively about another blogger, then that pair would get a high score. All of the back and forth in the comments is too much to track so it becomes - for the purpose of this model - a non-factor. 

If I recall correctly, NLS wasn't particularly isolated.  But there is one question that this kind of tracking could decide over time: is isolation a leading indicator of blog death? That information would be useful to a candidate who may be considering establishing a working relationship with an isolated blog.



About networks (Kindler - 12/18/2006 10:30:40 PM)
Okay, Kathy, no more doobies for you...

Seriously, this reminds me of the method Google uses to determine which websites rank highest -- based on what other sites link to them.  This is also how ant colonies work -- wherever the most ant trails are (as marked by pheromones) is where the honey is, and pretty soon the whole colony is beating a path in that direction (as I well know from watching my own kitchen).

So isolation does indicate a lack of success.  Also interesting to see how blogs connect to each other to form separate cliques, while there will always be a few that connect the cliques (e.g., NLS, which tends to link to conservative as well as progressive blogs). 

I read an interesting book, Nexus by Mark Buchanan, about how the question of "six degrees of separation" -- why and how are all human beings linked so closely -- is solved by modeling showing how the close-knit networks we inhabit are linked by a limited number of "connectors" who bridge the gaps between us.  Ditto blogs, I'm sure.

Neat stuff.  Thanks for sharing the raw contents of your brain.