
CS Table: Serendipity and Computing

On Friday, 3 October 2014, at CS Table, we will consider an intersection between computing and the arts, exploring the ways in which recommender systems can create experiences of serendipity. Alex Dodge, the College's Artist in Residence, will join us for the discussion.

Iaquinta, L., Gemmis, M. De, Lops, P., Semeraro, G., & Molino, P. (n.d.). Can a Recommender System induce serendipitous encounters?, 229–247. Read sections 1, 2, 3, and 4 (read further optionally). Available online at http://cdn.intechopen.com/pdfs-wm/10158.pdf.

Today recommenders are commonly used with various purposes, especially dealing with e-commerce and information filtering tools. Content-based recommenders rely on the concept of similarity between the bought/searched/visited item and all the items stored in a repository. It is a common belief that the user is interested in what is similar to what she has already bought/searched/visited. We believe that there are some contexts in which this assumption is wrong: it is the case of acquiring unsearched but still useful items or pieces of information. This is called serendipity. Our purpose is to stimulate users and facilitate these serendipitous encounters to happen.

Sun, T., & Mei, Q. (2012). Unexpected Relevance: An Empirical Study of Serendipity in Retweets. Read sections: Intro, Related Work, and Definition (read further optionally). Available online at http://www-personal.umich.edu/~qmei/pub/icwsm2013-sun.pdf.

Serendipity is a beneficial discovery that happens in an unexpected way. It has been found spectacularly valuable in various contexts, including scientific discoveries, acquisition of business, and recommender systems. Although never formally proved with large-scale behavioral analysis, it is believed by scientists and practitioners that serendipity is an important factor of positive user experience and increased user engagement. In this paper, we take the initiative to study the ubiquitous occurrence of serendipitous information diffusion and its effect in the context of microblogging communities. We refer to serendipity as unexpected relevance, then propose a principled statistical method to test the unexpectedness and the relevance of information received by a microblogging user, which identifies a serendipitous diffusion of information to the user. Our findings based on large-scale behavioral analysis reveal that there is a surprisingly strong presence of serendipitous information diffusion in retweeting, which accounts for more than 25% of retweets in both Twitter and Weibo. Upon the identification of serendipity, we are able to conduct observational analysis that reveals the benefit of serendipity to microblogging users. Results show that both the discovery and the provision of serendipity increase the level of user activities and social interactions, while the provision of serendipitous information also increases the influence of Twitter users.

The readings are available outside of Science 3821 or from Sam Rebelsky.

Computer science table is a weekly meeting of Grinnell College community members (students, faculty, staff, etc.) interested in discussing topics related to computing and computer science. CS Table meets Fridays from 12:10-12:50 in the Day PDR (JRC 224A). Contact Sam Rebelsky (rebelsky@grinnell.edu) for the weekly reading. Students on meal plans, faculty, and staff are expected to cover the cost of their meals. Students not on meal plans can charge their meals to the department.

CS Table: Privacy, Anonymity, and Big Data in the Social Sciences

On Friday, 26 September 2014, at CS Table, we will consider some recent ethical issues with the use of "Big Data" in social sciences research, including data from xMOOCs (Massive Open Online Courses). Our reading will include a short article from Atlantic Monthly on the recent Facebook Controversy and a CACM article on uses of xMOOC data.

Sara M. Watson. Data Science: What the Facebook Controversy is Really About. The Atlantic. July 1, 2014. Available online at http://www.theatlantic.com/technology/archive/2014/07/data-science-what-the-facebook-controversy-is-really-about/373770/.

Facebook has always “manipulated” the results shown in its users’ News Feeds by filtering and personalizing for relevance. But this weekend, the social giant seemed to cross a line, when it announced that it engineered emotional responses two years ago in an “emotional contagion” experiment, published in the Proceedings of the National Academy of Sciences (PNAS).

Since then, critics have examined many facets of the experiment, including its design, methodology, approval process, and ethics. Each of these tacks tacitly accepts something important, though: the validity of Facebook’s science and scholarship. There is a more fundamental question in all this: What does it mean when we call proprietary data research data science?

As a society, we haven't fully established how we ought to think about data science in practice. It's time to start hashing that out.

Jon P. Daries, Justin Reich, Jim Waldo, Elise M. Young, Jonathan Whittinghill, Andrew Dean Ho, Daniel Thomas Seaton, and Isaac Chuang. 2014. Privacy, anonymity, and big data in the social sciences. Commun. ACM 57, 9 (September 2014), 56-63. DOI=10.1145/2643132 http://doi.acm.org/10.1145/2643132.

Open data has tremendous potential for science, but, in human subjects research, there is a tension between privacy and releasing high-quality open data. Federal law governing student privacy and the release of student records suggests that anonymizing student data protects student privacy. Guided by this standard, we de-identified and released a data set from 16 MOOCs (massive open online courses) from MITx and HarvardX on the edX platform. In this article, we show that these and other de-identification procedures necessitate changes to data sets that threaten replication and extension of baseline analyses. To balance student privacy and the benefits of open data, we suggest focusing on protecting privacy without anonymizing data by instead expanding policies that compel researchers to uphold the privacy of the subjects in open data sets. If we want to have high-quality social science research and also protect the privacy of human subjects, we must eventually have trust in researchers. Otherwise, we'll always have the strict tradeoff between anonymity and science illustrated here.

Printed copies of the readings are available next to Science 3821.

