<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>TechPolicy.ca</title>
	<atom:link href="http://www.techpolicy.ca/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://www.techpolicy.ca</link>
	<description>Data mining politics and public policy. The politics of data mining.</description>
	<lastBuildDate>Wed, 18 Aug 2010 03:27:22 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Information Flow and Arbitrage in the Political Blogosphere</title>
		<link>http://www.techpolicy.ca/?p=108</link>
		<comments>http://www.techpolicy.ca/?p=108#comments</comments>
		<pubDate>Tue, 17 Aug 2010 22:45:59 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[Mathematical Models]]></category>
		<category><![CDATA[Social Media Mining]]></category>
		<category><![CDATA[Tracking Press Coverage]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[oxford]]></category>
		<category><![CDATA[social network analysis]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=108</guid>
		<description><![CDATA[As some of you may know, I recently submitted my dissertation for the MSc in Social Science of the Internet at the Oxford Internet Institute. The dissertation is still being graded, so I don't want to post it here just yet. However, the title of the dissertation is "Information Flow and Arbitrage in the Political [...]]]></description>
			<content:encoded><![CDATA[<p>As some of you may know, I recently submitted my dissertation for the <em>MSc in Social Science of the Internet</em> at the <a href="http://www.oii.ox.ac.uk" target="_blank">Oxford Internet Institute</a>. The dissertation is still being graded, so I don't want to post it here just yet. However, the title of the dissertation is "Information Flow and Arbitrage in the Political Blogosphere" and the abstract is below. <a href="mailto:wojciech@gmail.com">E-mail me</a> if you'd like to discuss it or get a copy of the dissertation.</p>
<blockquote><p>Over the last decade, political blogging has significantly grown in popularity and now represents a popular form of political engagement and information collection. This dissertation explores the political blogosphere in the context of the 2008 US Presidential election. From May 2008 to April 2009, 16,741 blogs were crawled on a daily basis, with their content and hyperlinks stored and analyzed. This dissertation provides an analysis of the flow of information through the blogosphere in the context of this data, through the use of social network analysis. Through a number of network-based methodological approaches, it is shown that the political blogosphere is organized in a core-periphery structure, with popular, elite bloggers organized in the core. The core itself is fragmented, composed of tightly-knit communities and members who are mutually aware of each other. These communities are fragmented and information does not easily flow between them. The periphery acts as a bridge, however, with information flowing to peripheral bloggers from multiple communities.</p>
<p>Through a node-level analysis, the dissertation further concludes that one can define different forms of influence and precursors to influence. Using a statistical model that controls for in- and out-degree distributions of the network, this dissertation is able to identify bloggers that act as information arbiters within their personal networks. A striking finding of the research is that the core is no better at information arbitrage and introducing their personal networks to new sources of information than the periphery; being a popular blogger entrenched in an elite community does not make it easier to promote new sources of information.</p>
<p>Furthermore, this thesis makes significant contributions to methodological work in social network analysis. It provides new approaches to analyzing longitudinal network data, and explores new statistical models for analyzing the significance of different social mechanisms in a network.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=108</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Data Mining for Development: Sales Pitch</title>
		<link>http://www.techpolicy.ca/?p=104</link>
		<comments>http://www.techpolicy.ca/?p=104#comments</comments>
		<pubDate>Mon, 28 Jun 2010 16:43:57 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[Mathematical Models]]></category>
		<category><![CDATA[Social Media Mining]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[international development]]></category>
		<category><![CDATA[machine learning]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=104</guid>
		<description><![CDATA[Over the last few weeks, I've been rekindling my interest in mathematics and international development. I studied both subjects during my undergraduate degree, and have kept trying to figure out a way to combine the two. I'm hoping to spend some time over the next two months running pilot projects in this area to see [...]]]></description>
			<content:encoded><![CDATA[<p>Over the last few weeks, I've been rekindling my interest in mathematics and international development. I studied both subjects during my undergraduate degree, and have kept trying to figure out a way to combine the two. I'm hoping to spend some time over the next two months running pilot projects in this area to see how well data mining, artificial intelligence, and statistics can be used to help development agencies, non-profits, and related institutions to do their work. Below is the rough draft of a short sales pitch I am working on in this area. <a href="mailto:wojciech@gmail.com">E-mail me</a> or comment if you are interested in learning more.</p>
<p><strong>Data Mining for Development: Proposal</strong></p>
<p><em>Organizations face two major data-related challenges: (1) how to collect meaningful data about their work, and (2) how to improve their operations and impact using that data. We’re here to help.</em></p>
<p>Today, non-profit organizations and social enterprises work in a challenging environment. With governments focusing on austerity measures and funding bodies receiving lower returns on their financial investments, non-profit organizations are continually facing an uphill battle. They are constantly being encouraged to improve how they operate and prove that they are having a positive effect. </p>
<p>Data Mining for Development (DM4D) is a non-profit project that aims to help organizations and social enterprises achieve their objectives. Composed of a team of graduate students, researchers, and professionals trained in mathematics, statistics, and computer science, DM4D helps organizations in a number of ways:</p>
<ol>
<li>DM4D helps organizations decide how to collect data and run project evaluations at a lower cost. We do this by finding ways to automate the data collection process, and by developing indicators for variables that are difficult to measure.</li>
<li>DM4D researchers are familiar with extracting meaningful information from thousands of documents at a time. Organizations produce countless reports, e-mails, and articles, and we can help make sense of such data.</li>
<li>DM4D builds mathematical models to help predict how the work environments of non-profit organizations are changing, so they can better prepare for unexpected events and changes coming in the future.</li>
</ol>
<p>Collecting and understanding large amounts of information and data is difficult and expensive. However, doing so can help an organization achieve its mission and objectives. DM4D is here to provide services and advice to help organizations leverage their data.</p>
<p>For more information, please contact Wojciech Gryc at <a href="mailto:wojciech@gmail.com">wojciech@gmail.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=104</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Prototype: More Web-Friendly Visualizations in R</title>
		<link>http://www.techpolicy.ca/?p=95</link>
		<comments>http://www.techpolicy.ca/?p=95#comments</comments>
		<pubDate>Sat, 12 Jun 2010 17:30:35 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[data visualization]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=95</guid>
		<description><![CDATA[I've spent some more time thinking about how best to put together the package for creating web-friendly, interactive data visualizations in R. I have a pretty substantial JavaScript package that does a lot of basic visualizations now, and it's really exciting to see where this is going. With this in mind, I'm releasing a new [...]]]></description>
			<content:encoded><![CDATA[<p>I've spent some more time thinking about how best to put together the package for creating web-friendly, interactive data visualizations in R. I have a pretty substantial JavaScript package that does a lot of basic visualizations now, and it's really exciting to see where this is going. With this in mind, I'm releasing a <a href="http://www.techpolicy.ca/wp-content/manual/june12/webviz_0.2-1.tar.gz">new version</a> of the R package prototype I keep discussing in this blog.</p>
<p>A number of functions are included, among them <i>wv.plot()</i>, <i>wv.lineplot()</i>, <i>wv.snaplot()</i>, and <i>wv.bargraph()</i>. The documentation still needs a lot of work, and there are no interactive abilities yet (though they exist in the JavaScript code).</p>
<p>What is most exciting about this package is that a lot of the steps one takes to make a complete graph have been split into individual functions. Thus, while one can make a scatterplot with <i>wv.plot()</i>, one can also build it from <i>wv.axis()</i> and <i>wv.points()</i>. Each data visualization gets its own ID, or can be assigned one, so one can later pass visualizations (e.g. the points in the scatterplot itself) as arguments to other functions, which opens the door to adding functions for interactivity.</p>
<p>A few examples of the visualizations are shown below, along with the R code needed to display them. Note that these are embedded in the blog through the use of inline frames.</p>
<p><b>Basic Scatterplot</b></p>
<p>The code below will generate a basic scatterplot.<br />
<code>x = rnorm(30)<br />
y = rnorm(30)<br />
wv.plot(x, y, "~/Desktop/scatterplot", height=300, width=300, xlim=c(-2.5,2.5), ylim=c(-2.5,2.5), xbreaks=c(0), ybreaks=c(0))</code></p>
<p><center><br />
<iframe src="http://www.techpolicy.ca/wp-content/manual/june12/scatterplot/plot.html" width=350 height=350>Your browser does not support inline frames.</iframe><br />
</center></p>
<p><b>Plot with Multiple Data Types</b></p>
<p>Suppose you want a scatterplot with multiple point types and a line. You can build this manually with the following code.</p>
<p><code>x = rnorm(30); y = rnorm(30); z = runif(30);<br />
wv.open("~/Desktop/plot3/", height=300, width=300);<br />
wv.axis(c(-3.5, 3.5), c(-3.5, 3.5), xbreaks=-2:2, ybreaks=-2:2);<br />
wv.points(x, y, xlim=c(-3.5, 3.5), ylim=c(-3.5, 3.5));<br />
wv.lines(sort(x), z, col="red", xlim=c(-3.5, 3.5), ylim=c(-3.5, 3.5));<br />
wv.close();</code></p>
<p><center><br />
<iframe src="http://www.techpolicy.ca/wp-content/manual/june12/mixedplot/plot.html" width=350 height=350>Your browser does not support inline frames.</iframe><br />
</center></p>
<p><b>Bar Graph</b></p>
<p>This is a new graph format.</p>
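<p><code>x = c(2.5, 7, 11);<br />
cats = c("A", "B", "C");  # hypothetical category labels; the original snippet left cats undefined<br />
wv.bargraph(x, cats, "~/Desktop/barplot", ylim=c(0, 15), ybreaks=(1:5)*3);</code></p>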
<p><center><br />
<iframe src="http://www.techpolicy.ca/wp-content/manual/june12/barplot/plot.html" width=350 height=350>Your browser does not support inline frames.</iframe><br />
</center></p>
<p>As always, <a href="mailto:wojciech@techpolicy.ca">comments are welcome</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=95</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Canadian CPI: Visualization Brainstorm</title>
		<link>http://www.techpolicy.ca/?p=88</link>
		<comments>http://www.techpolicy.ca/?p=88#comments</comments>
		<pubDate>Fri, 28 May 2010 00:37:49 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[canada]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[government]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=88</guid>
		<description><![CDATA[After finishing the R prototype for data visualization, I've started abstracting the various methods necessary to create beautiful graphs. While there's no preliminary version of the R package yet, I think I've taken a number of exciting steps. These include: Abstracting graph objects. Objects such as lines, scatter plots, and other graph types can all [...]]]></description>
			<content:encoded><![CDATA[<p>After finishing the <a href="http://www.techpolicy.ca/?p=83">R prototype</a> for data visualization, I've started abstracting the various methods necessary to create beautiful graphs. While there's no preliminary version of the R package yet, I think I've taken a number of exciting steps. These include:</p>
<ul>
<li><b>Abstracting graph objects.</b> Objects such as lines, scatter plots, and other graph types can all be treated in a similar fashion in JavaScript. I use this approach in the new version of the JavaScript graph presented below.</li>
<li><b>Including axes.</b> The last graphs did not have axes, grid lines, and other information cues. These ones do. While they have to be manually set, this presents an advantage in that one can choose which grid lines and axis points to show.</li>
<li><b>Interactivity.</b> The graph below actually has useful interactive features. Mousing over points provides information on the value of the point itself, while mousing over the line plot provides the title. Nothing too complex, but already fairly useful.</li>
</ul>
<p>I chose to present data on the Canadian consumer price index (CPI). This is <a href="http://www.bankofcanada.ca/en/inflation/index.html" target="_blank">freely available data</a> and serves as a reminder of the major political issue of our time... While I don't want to make this post political, the ultimate goal of this blog is to use such visualizations and mathematical models to better understand public policy and the role of data mining therein. Might as well start referencing useful data in this regard.</p>
<p>So without further ado, here's the graph...<center><br />
<iframe src="http://www.techpolicy.ca/wp-content/manual/may28/lineplot.html" width=450 height=325>Your browser does not support inline frames.</iframe><br />
</center></p>
<p>The next step is fairly clear: making the above possible in R!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=88</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Prototype: Web-Friendly Visualizations in R</title>
		<link>http://www.techpolicy.ca/?p=83</link>
		<comments>http://www.techpolicy.ca/?p=83#comments</comments>
		<pubDate>Tue, 18 May 2010 14:51:23 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[data visualization]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=83</guid>
		<description><![CDATA[Developing web-friendly data visualizations is not very difficult, though as far as I know, a package that allows one to do this directly in R does not exist (e-mail me if you know of one). As someone who has been developing lots of data-oriented software tools, it's always nice to post visualizations online. To facilitate [...]]]></description>
			<content:encoded><![CDATA[<p>Developing web-friendly data visualizations is not very difficult, though as far as I know, a package that allows one to do this directly in R does not exist (<a href="mailto:wojciech@gmail.com">e-mail me</a> if you know of one). As someone who develops a lot of data-oriented software tools, I always find it nice to post visualizations online. To facilitate this task, I've been fooling around with creating a data visualization prototype in R. While the package is very limited in what it does, I hope it'll generate a discussion on the types of visualization tools that could help R users post their work on the web.</p>
<p>At this stage, the package has three functions to illustrate scatter plots, line graphs, and social networks. Each function creates a new directory with all the necessary JavaScript and HTML files. The HTML file could then be embedded using an inline frame (as done below) or used as a standalone website.</p>
<p>You can download the prototype <a href="http://www.techpolicy.ca/wp-content/manual/may18/webviz_0.1-1.tar.gz">here</a>, and below are some examples of visualizations.</p>
<p><b>Scatter Plot</b></p>
<p><code>x = rnorm(25)<br />
y = rnorm(25)<br />
wv.scatterplot(x, y, "/wv-scatterplot", height=300, width=300, marginsize=0.1)</code></p>
<p><center><br />
<iframe src="http://www.techpolicy.ca/wp-content/manual/may18/wv-scatterplot/scatterplot.html" width=320 height=320 border=1>Your browser does not support inline frames.</iframe><br />
</center></p>
<p><b>Line Graph</b></p>
<p><code>x = -100:100/10<br />
y = sin(x)<br />
wv.lineplot(x, y, "/wv-lineplot", height=300, width=300, marginsize=0.1)</code></p>
<p><center><br />
<iframe src="http://www.techpolicy.ca/wp-content/manual/may18/wv-lineplot/lineplot.html" width=320 height=320 border=1>Your browser does not support inline frames.</iframe><br />
</center></p>
<p><b>Social Network</b></p>
<p><code><br />
library(igraph)<br />
g <- erdos.renyi.game(15, 0.175)<br />
wv.sna(g, "/wv-sna", rnorm(15, 2, 0.75), width=400, height=400)</code></p>
<p><center><br />
<iframe src="http://www.techpolicy.ca/wp-content/manual/may18/wv-sna/netplot.html" width=420 height=420 border=1>Your browser does not support inline frames.</iframe><br />
</center></p>
<p><b>Next Steps</b></p>
<p>I apologize in advance, as some of the code above may be buggy and it certainly isn't very customizable. The next step -- assuming there's interest -- is to abstract the graph drawing to individual functions so one can then produce multiple graphs in one canvas or frame. Making more options for interactivity, labels, and so on is also a must. Again, comments and suggestions are very welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=83</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Visualizing Networks in JavaScript</title>
		<link>http://www.techpolicy.ca/?p=79</link>
		<comments>http://www.techpolicy.ca/?p=79#comments</comments>
		<pubDate>Tue, 02 Mar 2010 22:52:59 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[Social Media Mining]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[politics]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=79</guid>
		<description><![CDATA[Continuing my exploration of JavaScript-based data visualization, I've created a basic network visualizer for the MP data I'm collecting. Below is a social network of all the Canadian federal ministers who have been mentioned together in various press and social media sources in the last week. Note that the [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing my exploration of JavaScript-based data visualization, I've created a basic network visualizer for the MP data I'm collecting. Below is a social network of all the Canadian federal ministers who have been mentioned together in various press and social media sources in the last week.<br />
<center><iframe src="http://www.techpolicy.ca/wp-content/manual/mar2/presstracker-mp.html" width="420" height="420" style="border-style:solid;border-width:1px;">Your browser does not support iframes.</iframe></center></p>
<p>Note that the size of the node represents the number of articles mentioning the MP in the past week.</p>
<p>If you want the source code or if the visualization does not work, please <a href="mailto:wojciech@techpolicy.ca">e-mail</a> me.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=79</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Beautiful Web-Based Graphs</title>
		<link>http://www.techpolicy.ca/?p=73</link>
		<comments>http://www.techpolicy.ca/?p=73#comments</comments>
		<pubDate>Mon, 22 Feb 2010 00:47:11 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[Visualization]]></category>
		<category><![CDATA[charts]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=73</guid>
		<description><![CDATA[I regularly show charts on this website, and for the past few days, have been trying to find a good way to do this. Many of the charts so far have been shown as PDF or JPG files. These are fine, but they are not very responsive. Furthermore, many of the packages available for graphing [...]]]></description>
			<content:encoded><![CDATA[<p>I regularly show charts on this website, and for the past few days, have been trying to find a good way to do this. Many of the charts so far have been shown as PDF or JPG files. These are fine, but they are not very responsive. Furthermore, many of the packages available for graphing are proprietary or not open source, and this is a problem for me. I decided to look for something I could live with when it comes to displaying charts and graphs.</p>
<p>Quite a few people have recommended <a href="http://code.google.com/apis/charttools/" target="_blank">Google Charts</a>, which definitely has a lot to offer. However, I also want to customize my charts and make my own chart types (for example, social network illustrations). Another good package is <a href="http://teethgrinder.co.uk/open-flash-chart-2/" target="_blank">Open Flash Chart</a>, but I don't have a Flash license and prefer things to be a bit more open. Finally, there's <a href="http://processing.org/" target="_blank">Processing</a>. This is a great language, but Java applets on a website bug me.</p>
<p>I'm quite picky, but have finally found a useful tool: <a href="http://raphaeljs.com/" target="_blank">Raphaël</a> -- a library meant for representing vector graphics using JavaScript. While they have a <a href="http://g.raphaeljs.com/" target="_blank">graphing library</a>, I decided to write my own code to play around with the library and customize the graphics. Overall, I must say that I am very impressed with the package.</p>
<p>As an example, the chart below shows a bubble plot. While fairly basic, I'm really happy with how easy it is to make <i>interactive</i> charts. Mousing over the bubbles changes their colour, and adding other features is fairly easy.<br />
<center><iframe src="http://www.techpolicy.ca/wp-content/manual/feb22/scatterplot.html" width="320" height="320" style="border-style:solid;border-width:1px;">Your browser does not support iframes.</iframe></center></p>
<p>Another example is a line chart, shown below.<br />
<center><iframe src="http://www.techpolicy.ca/wp-content/manual/feb22/lineplot.html" width="320" height="320" style="border-style:solid;border-width:1px;">Your browser does not support iframes.</iframe></center></p>
<p>I'll do my best to improve these charts and make them more interactive and useful. Please <a href="mailto:wojciech@techpolicy.ca">e-mail me</a> if you want the source code.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=73</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Mobile World Congress 2010</title>
		<link>http://www.techpolicy.ca/?p=71</link>
		<comments>http://www.techpolicy.ca/?p=71#comments</comments>
		<pubDate>Sat, 13 Feb 2010 02:24:06 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Business]]></category>
		<category><![CDATA[events]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=71</guid>
		<description><![CDATA[In a few hours, I'm flying to Barcelona for the Mobile World Congress, an annual event that showcases pretty much everything related to mobile technologies. I'm quite excited about this event, as it's bringing together around 40,000 to 50,000 people interested in mobile technologies, business, and related areas. If you're attending and interested in data [...]]]></description>
			<content:encoded><![CDATA[<p>In a few hours, I'm flying to Barcelona for the <a href="http://www.mobileworldcongress.com/" target="_blank">Mobile World Congress</a>, an annual event that showcases pretty much everything related to mobile technologies. I'm quite excited about this event, as it's bringing together around 40,000 to 50,000 people interested in mobile technologies, business, and related areas.</p>
<p>If you're attending and interested in data mining, social network analysis, social media mining, and mobile technologies, feel free to <a href="mailto:wojciech@techpolicy.ca">e-mail me</a>. I'm always open to meeting people!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=71</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking the Press: Minister Networks</title>
		<link>http://www.techpolicy.ca/?p=63</link>
		<comments>http://www.techpolicy.ca/?p=63#comments</comments>
		<pubDate>Mon, 08 Feb 2010 04:26:44 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[Tracking Press Coverage]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[government]]></category>
		<category><![CDATA[policy]]></category>
		<category><![CDATA[press tracking]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=63</guid>
		<description><![CDATA[About a week ago, I discussed tracking Canadian MPs based on the number of times they get mentioned in various news media, and who they get mentioned with. At the time, I only showed a chart of mentions, and discussed some shortcomings of the approaches used for tracking politicians -- or, for that matter, any [...]]]></description>
			<content:encoded><![CDATA[<p>About a week ago, I discussed <a href="http://www.techpolicy.ca/?p=53">tracking Canadian MPs</a> based on the number of times they get mentioned in various news media, and who they get mentioned with. At the time, I only showed a chart of mentions, and discussed some shortcomings of the approaches used for tracking politicians -- or, for that matter, any brand.</p>
<p>I've been working on improving my tracking software and also working on new visualizations. The work has culminated in the network below, and a <a href="http://www.techpolicy.ca/wp-content/uploads/manual/mpnetwork.pdf">high quality PDF version</a> is also available:<br />
<a href="http://www.techpolicy.ca/wp-content/uploads/manual/mpnetwork.pdf"><img src="http://www.11-55.org/wojciechdotca/wp-content/uploads/2010/02/mpnetwork.jpg" alt="" title="mpnetwork" width="383" height="454" class="aligncenter size-full wp-image-64" /></a></p>
<p>This network tracks Canadian federal ministers in various blogs, magazines, and newspapers. The size of the circle with the minister's name represents the number of articles (i.e. the larger the circle, the more articles), while a connection exists between ministers if they have been mentioned together in at least one article or blog post over the last week.</p>
<p>Such a network representation provides very useful information about press coverage of Canadian ministers. A great example is that Prime Minister Stephen Harper gets mentioned very often <i>relative</i> to other ministers, but is not mentioned often <i>with</i> other ministers. Tony Clement or Jim Prentice, on the other hand, get mentioned with more ministers, but have fewer articles about them.</p>
<p>One thing the network does not show, however, is how often the co-mentions occur. It's possible, for example, that a set of five or six ministers was mentioned in one article, and this would create something like the dense set of connections with ministers Flaherty, Prentice, Clement, and others. More information would be necessary to analyze whether this is the case or not.</p>
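<p>As a rough illustration of how mention counts and co-mention counts could be computed together, here is a minimal sketch in base R. It is not the tracking software itself: the <i>articles</i> list is made-up example data, and every variable name is an assumption chosen for this example.</p>

```r
# Hypothetical input: each element lists the ministers named in one article.
articles = list(
  c("Harper"),
  c("Harper"),
  c("Flaherty", "Prentice", "Clement"),
  c("Clement", "Prentice")
)

ministers = sort(unique(unlist(articles)))

# Article-by-minister incidence matrix (1 = minister appears in the article).
incidence = t(vapply(articles,
                     function(a) as.integer(ministers %in% a),
                     integer(length(ministers))))
colnames(incidence) = ministers

# Node size: number of articles mentioning each minister.
mentions = colSums(incidence)

# Co-mention counts: entry [i, j] is the number of articles naming both i and j.
comentions = crossprod(incidence)
diag(comentions) = 0

# The unweighted network described above corresponds to comentions > 0.
adjacency = comentions > 0
```

<p>Keeping <i>comentions</i> as counts rather than collapsing it to <i>adjacency</i> is what would distinguish a single article naming several ministers at once from ministers who repeatedly appear together.</p>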
<p>Stay tuned for more updates on the software. I also hope to have a website set up where this is all done automatically and people can peruse social media surrounding Canadian politics.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=63</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Graphs, Maps, and Trees</title>
		<link>http://www.techpolicy.ca/?p=60</link>
		<comments>http://www.techpolicy.ca/?p=60#comments</comments>
		<pubDate>Thu, 04 Feb 2010 16:22:15 +0000</pubDate>
		<dc:creator>Wojciech Gryc</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[information mining]]></category>
		<category><![CDATA[information retrieval]]></category>

		<guid isPermaLink="false">http://www.techpolicy.ca/?p=60</guid>
		<description><![CDATA[I just finished reading Graphs, Maps, and Trees by Franco Moretti. The book was recommended to me by a friend (thanks Tom!) and I must say I really enjoyed it. While the book does not discuss information theory, machine learning, or data mining, it provides a very interesting argument for more rigour in literary studies. [...]]]></description>
			<content:encoded><![CDATA[<p>I just finished reading <a href="http://www.versobooks.com/books/klm/m-titles/moretti_graphs.shtml" target="_blank">Graphs, Maps, and Trees</a> by Franco Moretti. The book was recommended to me by a friend (thanks Tom!) and I must say I really enjoyed it.</p>
<p>While the book does not discuss information theory, machine learning, or data mining, it provides a very interesting argument for more rigour in literary studies. Furthermore, I believe it provides a great introduction to the possibilities that information theory holds for political science, business intelligence, and related fields. A particularly powerful example of this is when Moretti writes,</p>
<blockquote><p>What do literary maps do... First, they are a good way to prepare text for analysis. You choose a unit--walks, lawsuits, luxury goods, whatever--find its occurrences, place them in space... Or in other words, you <i>reduce</i> the text to a few elements, and <i>abstract</i> them from the narrative flow, and construct a new, <i>artificial</i> object like the maps that I have been discussing. And with a little luck, these maps will be <i>more than the sum of their parts</i>: they will possess 'emerging' qualities, which were not visible at the lower level.</p></blockquote>
<p>In this paragraph, Moretti specifically discusses the use of geographical representations of novels to study the patterns behind the stories therein. If we go beyond maps specifically and discuss graphs, trees, networks, and other abstract analytical tools, we can see how using any such tools may illuminate underlying patterns in literary works.</p>
<p>As Moretti discusses at the start of his book, a major challenge to literary research is that reading all the novels published in a specific period is impossible. There are simply too many of them. The use of graphs allows one to analyze such works in aggregate while dealing with the shortcoming of not being able to read as fast as content is produced. Social media and press tracking face a similar challenge. There are too many blog posts, articles, Tweets, status updates, and websites out there for a consultant or researcher to read and aggregate by hand. As such, one needs more abstract frameworks for dealing with the data.</p>
<p>If you are looking for a non-technical introduction to the possibilities held within information retrieval and data mining, this is a great book. While Moretti doesn't discuss automated or algorithmic approaches to his work, the mental leap from his work to automated strategies is short and easy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpolicy.ca/?feed=rss2&amp;p=60</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
