The Contours of Crowd Capability
John Prpić Beedie School of Business, SFU firstname.lastname@example.org
Prashant Shukla Beedie School of Business, SFU email@example.com
Abstract
In this work we use the theory of Crowd Capital as a lens to compare and contrast a number of IS tools currently in use by organizations for crowd-engagement purposes. In doing so, we contribute to both the practitioner and research domains. For the practitioner community we provide decision-makers with a convenient and useful resource, in table-form, outlining in detail some of the differing potentialities of crowd-engaging IS. For the research community we begin to unpack some of the key properties of crowd-engaging IS, including some of the differing qualities of the crowds that these IS applications engage.

1. Introduction
The existence of dispersed knowledge has been a subject of inquiry for more than six decades. Despite the longevity of this rich research tradition, the “knowledge problem” has remained largely unresolved both in research and practice, and remains “the central theoretical problem of all social science”. However, in the 21st century, organizations are presented with opportunities through technology to benefit, at least to some extent, from dispersed knowledge. One such opportunity is represented by the recent emergence of a variety of crowd-engaging information systems (IS).

In this vein, Crowdsourcing [5, 6] is being widely studied in numerous contexts, and the knowledge generated from these IS phenomena is well-documented [2, 13, 30]. At the same time, other organizations are leveraging dispersed knowledge by putting in place IS applications such as Prediction Markets to gather large sample-size forecasts from inside and outside the organization. Similarly, we are also observing many organizations using IS tools such as “Wikis” to access the knowledge of dispersed populations within the boundaries of the organization. Further still, other organizations are applying gamification techniques [10, 24] to accumulate Citizen Science knowledge from the public at large through IS.

Among these seemingly disparate phenomena, a complex ecology of crowd-engaging IS has emerged, involving millions of people all around the world generating knowledge for organizations through IS. However, despite the obvious scale and reach of this emerging crowd-engagement paradigm, we know of no research that systematically compares and contrasts a large variety of these existing crowd-engaging IS tools in one work.

We seek to address this research void by comparing and contrasting a number of the crowd-engaging forms of IS currently available for organizational use. To achieve this goal, we employ the Theory of Crowd Capital as a lens to systematically structure our investigation of crowd-engaging IS. Employing this parsimonious lens, we first explain how Crowd Capital is generated through Crowd Capability in organizations. Taking this conceptual platform as a point of departure, in Section 3 we offer an array of examples of IS currently in use in modern practice to generate Crowd Capital. We compare and contrast these emerging IS techniques using the Crowd Capability construct, therein highlighting some important choices that organizations face when entering the crowd-engagement fray. This comparison, which we term “The Contours of Crowd Capability”, can be used by decision-makers and researchers alike to differentiate among the many extant methods of Crowd Capital generation. At the same time, our comparison also illustrates some important differences to be found in the internal organizational processes that accompany each form of crowd-engaging IS. In Section 4, we conclude with a discussion of the limitations of our work.
2. Theoretical Background
From both the resource based view and the knowledge based view of the organization [26, 27], unique knowledge is viewed as a valuable commodity for organizations, potentially endowing organizations with an advantage over their competitors. More recently, innovation scholars have reasoned that organizations should give equal importance to internal and external knowledge sources for their R&D activities, while others have argued that the utilization of external knowledge gives organizations a competitive edge through decreased R&D costs. Using these perspectives, Prpić and Shukla bound and explain the dynamics and mechanisms that enable organizations to engage crowds through IS, and in doing so, supply a coherent and parsimonious model explaining how and why organizations engage these disparate knowledge sources. The result is the Theory of Crowd Capital (see Figure #1 below, adapted from Prpić and Shukla).

Figure #1 - The Theory of Crowd Capital. The dispersed knowledge of individuals is engaged and processed by the Crowd Capability of an organization, generating a heterogeneous Crowd Capital resource.
The Theory of Crowd Capital suggests that a new form of heterogeneous knowledge resource is available to organizations that use IS to engage a crowd. The authors conceptualize that Crowd Capital is an organizational-level knowledge resource generated by an organization’s Crowd Capability. In turn, Crowd Capability is an organizational-level capability, defined by the structure, content, and process of an organization’s engagement with the dispersed knowledge of individuals, that is, a Crowd [12, 22]. The structure component of Crowd Capability is always an IS-mediated phenomenon and denotes the technological means employed by an organization to engage a crowd population. The content dimension of Crowd Capability constitutes the knowledge, information or data that an organization seeks from a crowd population. The process dimension, in turn, defines the internal procedures that an organization will use to organize, filter, and integrate the incoming knowledge, information, or data.

Further, Prpić & Shukla also delineate that the structure dimension of the Crowd Capability construct can function in episodic or continuing forms, depending on the design of the IS used to engage a crowd. For example, Google’s ReCaptcha, the Iowa Electronic Prediction market, and Foldit illustrate the episodic nature of Crowd Capability structure, where no community, collaboration, interaction or relationships among the participants are needed through the IS for Crowd Capital to be generated. On the other hand, peer production, co-creation, and innovation communities underscore the importance of social capital in efforts to engage an IS-mediated crowd. These efforts are continuing in nature, as there is interaction, community, collaboration and relationships among the participants using the IS to generate knowledge for the organization. In the ensuing section of this work, we use this theoretical perspective to compare and contrast more than a dozen different IS tools currently in use for crowd-engagement.

3. The Contours of Crowd Capability
In this section, we present numerous examples of IS currently in use by organizations to generate Crowd Capital. We discuss the different crowds that these forms of IS engage, and we further compare and contrast these IS applications along the structure, content, and process dimensions of the Crowd Capability construct. Table #1 (see next page) summarizes the different “Contours of Crowd Capability” that we are observing in today’s business environment. We discuss these differing dimensions in turn, below.
3.1 Differing Crowds
As is evident from Table #1, the different IS tools analyzed here are designed to engage demonstrably different populations of participants. Some efforts, like those of ReCaptcha and Wikipedia, engage individuals from the public at large, where contributions can be made by anyone. Other forms of IS analyzed here, such as Crowdflower, M-Turk, and Hiretheworld, also engage individuals from the public at large, though these applications curate the individuals who participate. Curation occurs when the individuals participating are “vetted” in one way or another, and such curation is often actuated through symbols or information in the IS directly associated with an individual’s screen name and/or profile. This “vetting” can occur as a result of historical performance measures (such as leaderboards), through techniques such as peer-evaluation, the award of badges for certain services rendered, or by the mutual assessment of participants. Because these symbols accrue (or fail to accrue) to a user in the IS, curation provides signals [16, 25] about that user relative to the other participants. Although curation techniques have also been used specifically for content purposes in other settings [18, 17], for the purposes of our investigation we focus only on the curation of individual participants.
Table #1 - The Contours of Crowd Capability
Columns: IS currently in use for Crowd Engagement¹; Nature of the Crowd Engaged; Crowd Capability - Structure; Crowd Capability - Content; Crowd Capability - Process

Google’s ReCAPTCHA
Structure: Web application structured in Episodic form.
Content: Text digitization and Spam reduction.
Process: The application aggregates the snippets of digital text inputted by individuals into fully digitized works.

Wikipedia
Structure: Web-based platform structured in Episodic form for contributors, and Continuing form for Editors.
Content: Encyclopedia entries.
Process: Edits contributed by individuals are monitored and approved by a community of editors.

Innocentive, M-Turk, Crowdflower, Hiretheworld, Kaggle, 99Designs, MobileWorks
Nature of the Crowd Engaged: Public Crowd - Curated
Structure: Web-based platforms structured in Continuing form.
Content: Variety of content types available to be accessed by organizations, customizable to the idiosyncratic needs of the organization, including Problem Solving, R&D, Idea generation, and Microtasks.
Process: Organizations using the services provided by these intermediaries must internally process the knowledge that they receive through their own means.

NapkinLabs, Imaginatik, DataStation, Lumenogic
Structure: Software/Web applications structured in Continuing form.
Content: Customizable to an organization’s idiosyncratic needs.
Process: These applications provide a variety of features to assist the internal processing of incoming knowledge, including “Dashboards” for analysis.

BestBuy’s TagTrade & Blue Shirt Nation
Structure: Web applications structured in Continuing form for Blue Shirt Nation, and Episodic form for TagTrade.
Content: Market Research (TagTrade) & Internal Operations (Blue Shirt Nation).
Process: In TagTrade, knowledge is filtered through employee participation with simulated market mechanisms. With Blue Shirt Nation, knowledge is codified by employees, acting as a repository for future access and dissemination.

¹ See Appendix #1 for a corresponding list of URLs.
Furthermore, some forms of the IS investigated here engage what we term as “captive crowds” (such as in the case of Best Buy’s TagTrade & Blue Shirt Nation). In these examples, Best Buy leverages its own internal workforce as a captive crowd of dispersed knowledge. These crowds are captive in the sense that they are exclusive to the individuals working for the organization, though unless an organization takes further measures to institute relative signals among the participants in the IS, we do not consider such captive crowds to be automatically curated. Other forms of IS investigated here, such as NapkinLabs, DataStation, and Imaginatik, do not directly engage one specific form of a crowd or another. Rather, these applications are designed to work with whatever crowd an organization may already have engaged through other IS, such as social media properties like Facebook and Twitter. Overall, when making a decision of which type of Crowd Capability IS to buy, rent or develop, an organization needs to carefully consider the population of participants that it may already have access to, or the population that it would like to engage, realizing that not all forms of relevant IS are created equal in this regard.
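The selection question raised above can be made concrete with a small sketch. The Python below encodes a handful of the IS discussed in this section as records and shortlists those whose crowd type matches the population an organization has or wants. The class, field, and function names are illustrative assumptions, and the crowd labels paraphrase the discussion in this section rather than quote Table #1.

```python
from dataclasses import dataclass
from enum import Enum


class Crowd(Enum):
    # Labels are assumptions paraphrasing Section 3.1, not Table #1 wording.
    PUBLIC = "public crowd, open to anyone"
    PUBLIC_CURATED = "public crowd, curated"
    CAPTIVE = "captive crowd (internal workforce)"
    EXISTING = "whatever crowd the organization already engages"


@dataclass(frozen=True)
class CrowdIS:
    name: str
    crowd: Crowd
    content: str  # the kind of knowledge sought from the crowd


# A small, hand-picked sample of the IS discussed in the text.
SAMPLE = [
    CrowdIS("ReCaptcha", Crowd.PUBLIC, "text digitization and spam reduction"),
    CrowdIS("Wikipedia", Crowd.PUBLIC, "encyclopedia entries"),
    CrowdIS("M-Turk", Crowd.PUBLIC_CURATED, "microtasks"),
    CrowdIS("Kaggle", Crowd.PUBLIC_CURATED, "problem solving"),
    CrowdIS("TagTrade", Crowd.CAPTIVE, "market research"),
    CrowdIS("Blue Shirt Nation", Crowd.CAPTIVE, "internal operations"),
    CrowdIS("NapkinLabs", Crowd.EXISTING, "customizable"),
]


def shortlist(candidates, crowd):
    """Return the names of the IS that engage the given kind of crowd."""
    return [c.name for c in candidates if c.crowd is crowd]


print(shortlist(SAMPLE, Crowd.CAPTIVE))
```

An organization that already has a large internal workforce might start from the captive-crowd shortlist, whereas one with an established social media following might look first at the tools that work with an existing crowd.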
3.2 The Crowd Structure
As we have learned thus far in this exposition, Crowd Capability includes three dimensions that need to be considered before an organization can engage dispersed knowledge. The structure dimension details the IS that the organization will use to engage a crowd, the content dimension details the specific type of knowledge that the organization seeks from a crowd, and the process dimension outlines the internal processes that the organization will employ to filter, organize, and purpose the knowledge that is received from a crowd. In this regard, and as is evident from Table #1, there is a great deal of variety to be found across all three dimensions of Crowd Capability in practice today.

The structure dimension of Crowd Capability is where an organization uses IS to engage a crowd. As we can see from Table #1, some organizations are using mobile software applications like MobileWorks, others are using web-based software applications (e.g. ReCaptcha), others still have created web-properties (Wikipedia), while others (like M-Turk) use web properties to offer services of crowd intermediation. Beyond these extant examples, there are other forms of IS, including bots, sensor technologies, and 3D printers, that could also potentially be productively employed (either independently or in combination) to engage a crowd, though for the purposes of this paper we limit our analysis to the forms detailed above. Nonetheless, the variety currently found in the structure dimension of Crowd Capability indicates that organizations have many different IS options available to engage dispersed knowledge.

3.3 The Crowd Content
The content dimension of Crowd Capability displays a wide variety of different knowledge needs/goals in our comparison in Table #1. We can see that some IS are used to generate knowledge in a literal sense, as in the case of Wikipedia and its encyclopedia entries. Others are targeted at generating ideas and creativity, as in the case of Hiretheworld and 99Designs. Further still, others like Kaggle are solving specific problems for organizations, while endeavours like Innocentive are generating R&D. Intermediation web-properties like M-Turk and Crowdflower provide ready and willing labour for organizations to access, and said labour can perform a variety of tasks (perhaps any of those already mentioned), though thus far the individuals at such intermediation services are thought to excel at microtasks such as the translation of documents, labelling photos, and participating in surveys.

From our perspective, Google’s ReCaptcha is a particularly interesting application of the Crowd Capability content dimension. Google seeks to digitize books through the automation of microtasks, and these microtasks serve a dual purpose: they reduce spam and digitize books at the same time. With ReCaptcha, Google thus combines two Crowd Capability content tasks (spam reduction and text digitization) in one Crowd Capability IS structure, which from our perspective appears to be the first of its kind to achieve such a feat. Overall, organizations have many existing options to choose from when considering the types of content that they wish to access from crowds.
3.4 The Crowd Process
In terms of the process dimension of Crowd Capability, here too we find a variety of approaches in Table #1. Because this dimension delineates the internal processes that an organization will institute to filter, organize, and purpose the knowledge gained from a crowd, the process dimension can in essence be thought of as the “last mile” of Crowd Capital creation. As we see from Table #1, some existing forms of IS, like M-Turk and Crowdflower, provide organizations with little or no support for the internal processing of incoming crowd knowledge. On the other hand, other forms of IS such as ReCaptcha involve some significant pre-processing work before the individual is engaged with the IS, but automate the processing of knowledge thereafter. In this case pre-processing is necessary in the sense that the bits of text that are undecipherable by OCR applications have to be transformed into digital images before they can be used in the ReCaptcha system. Once this pre-processing is achieved, the ReCaptcha system automates the processing of the text inputted by individuals into fully digitized works.

Some IS applications analyzed here, such as NapkinLabs, in effect specialize in the process aspect of Crowd Capability. They offer customized solutions for the internal processing of incoming knowledge, through features of the application such as “Insight Dashboards” that are specifically designed to aid an organization’s analysis and use of incoming knowledge. Furthermore, some IS applications employ incentives such as pricing mechanisms (as is common with Prediction Markets) to persuade individuals to process knowledge. In this vein, Best Buy and its TagTrade process is a powerful exemplar.

In our view, of particular interest along the process dimension of Crowd Capability is the example that Wikipedia provides. In the Wikipedia system a relatively small crowd of editors evaluates and approves the knowledge contributed by the larger public crowd. This fact is impressive, given that both sets of Wikipedia crowds are volunteers, but perhaps more importantly, it signals that organizations may benefit from using one crowd to process the knowledge generated from other distinct crowds. From our perspective, it would be interesting to see if an organization could implement a Best Buy type of captive crowd to process the knowledge from a large public crowd, derived for example from M-Turk or Crowdflower. Alternatively, we feel the reverse may also be interesting, in that an organization could use an M-Turk type public-curated crowd to process and evaluate the knowledge from a Best Buy type of captive crowd. Whatever the case may be, in our view, the future of Crowd Capability may lie in part in the intersection of crowds or in using multiple crowds in parallel.

Finally, in terms of the process dimension of Crowd Capability, it is important to note that other forms of IS not created for crowd-engagement, such as data mining and business intelligence applications, might also be fruitfully employed by the organization for the purposes of processing incoming crowd knowledge. Though we are as yet unaware of any extant situation employing such a configuration in the domain of crowd engagement, it may well be that such “…loosely coupled layers of digital technologies…” have significant bearing on the processing of incoming crowd knowledge, and thus on Crowd Capital creation too.
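The idea of one crowd processing the knowledge produced by another can be sketched in a few lines of Python. The example is hypothetical: the function name, the sample data, and the strict-majority approval rule are assumptions made for illustration, and are not a description of how Wikipedia or any other IS discussed here actually operates.

```python
def filter_by_reviewers(contributions, reviews):
    """Keep contributions approved by a strict majority of reviewers.

    contributions: dict mapping a contribution id to text from the first crowd
    reviews: dict mapping a contribution id to a list of True/False votes
             cast by members of the second (reviewing) crowd
    """
    accepted = {}
    for cid, text in contributions.items():
        votes = reviews.get(cid, [])
        # Items with no votes are held back rather than accepted by default.
        if votes and sum(votes) * 2 > len(votes):
            accepted[cid] = text
    return accepted


# Hypothetical data: three public-crowd contributions, two of them reviewed.
public_crowd = {
    "a1": "Edit to article A",
    "b7": "Edit to article B",
    "c3": "Edit to article C",
}
reviewer_crowd = {
    "a1": [True, True, False],
    "b7": [False, True, False],
}

print(filter_by_reviewers(public_crowd, reviewer_crowd))
```

Swapping the two dictionaries' sources (for example, captive-crowd contributions reviewed by a public-curated crowd) leaves the logic unchanged, which is one way to read the suggestion that crowds can be combined in either direction.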
3.5 Episodic Vs. Continuing Structure
The final characteristic upon which we differentiate the forms of IS in Table #1 is a subset of the Crowd Capability structure dimension. Here, we draw upon the important distinction made by Prpić & Shukla between Episodic and Continuing Crowd Capability structure.

As mentioned earlier in this work, an Episodic structure is a form of IS that does not need collaboration, cooperation, interaction or relationships among the engaged participants for the knowledge resource to be generated. Google’s ReCaptcha is a leading example of this type of structure, as the very many contributors (approximately 200 million ReCaptchas per day are typed by individuals, equalling about 500,000 hours of work per day²) never interact with one another. On the other hand, Continuing Crowd Capability structure uses collaboration and relationships among the participants, in one form or another, to generate knowledge resources. A good exemplar in this realm is Best Buy’s Blue Shirt Nation, which relies on a volunteer community of internal employees to codify and exchange knowledge.

Once more, we find that Wikipedia is an interesting example along this dichotomy, in that it implements both forms of Crowd Capability structure simultaneously. Wikipedia implements an Episodic structure by allowing anybody to contribute to the encyclopedia in a “one-off” manner, yet at the same time it employs a Continuing structure too, through the community of editors that monitor and approve the episodic contributions.

In terms of the economics of this dichotomy, we feel that the different manifestations of Crowd Capability structure presented here would entail very different economic repercussions. For support in this regard, we point to the simple fact that some Episodic structures like ReCaptcha “automate” the processing of incoming crowd knowledge, whereas most, if not all, Continuing structures require human processing of the incoming knowledge. We would thus expect that such structural differences, encoded in the IS, would have a major bearing on the costs associated with each kind of structure.

Overall, it is very important for organizations to understand the Episodic-Continuing dichotomy of Crowd Capability structure, as most practitioners and researchers falsely assume that community, collaboration, interactions, and relationships within a crowd are necessary to generate knowledge therein. As we have illustrated, Continuing Crowd Capability structure is but one of the possible options available to organizations considering entering the crowd fray.
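The ReCaptcha figures quoted in Section 3.5 give a rough sense of the scale at which an Episodic structure can operate. The short calculation below simply divides one quoted figure by the other; the variable names are ours, and the 8-hour working day used for the last line is an assumption added for illustration only.

```python
# Approximate figures quoted in Section 3.5.
RECAPTCHAS_PER_DAY = 200_000_000
WORK_HOURS_PER_DAY = 500_000

# Implied average effort per ReCaptcha, in seconds.
seconds_per_recaptcha = WORK_HOURS_PER_DAY * 3600 / RECAPTCHAS_PER_DAY
print(seconds_per_recaptcha)  # 9.0

# Assumption (not from the text): an 8-hour working day.
ASSUMED_HOURS_PER_WORKDAY = 8
equivalent_workdays = WORK_HOURS_PER_DAY / ASSUMED_HOURS_PER_WORKDAY
print(equivalent_workdays)  # 62500.0
```

In other words, the quoted figures imply roughly nine seconds of effort per contribution, aggregated without any interaction among contributors.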
4. Limitations & Conclusion
In this work we use the theory of Crowd Capital as a lens to compare and contrast more than a dozen emerging IS tools currently in use by organizations for crowd-engagement purposes. We systematically employ the dimensions of the Crowd Capability construct to differentiate among these emerging forms of IS, and thus begin to outline the contours of Crowd Capability currently in use for crowd-engagement by organizations. In terms of IS structure, we find that organizations are using mobile software applications, web-based software applications, web-properties, and web properties offered as intermediation services. We further find that these IS applications can be considered to operate through either Episodic or Continuing structures, determined by the nature of the human participation required by the IS. In parallel, we also begin to explore the nature of the different crowds that organizations can engage through the aforementioned IS applications. We find that some forms of the IS analyzed here target captive crowds, while others engage public crowds. Further, we find that some public crowds occur in curated forms, where individual crowd participants are vetted to some degree through the IS.

Like all research, our work here has limitations. Our investigation is very far from exhaustive, in the sense that we only consider fourteen different forms of IS currently in use for crowd-engagement. Most likely there are forms of crowd-engaging IS of which we are currently unaware, which may endow different affordances than those we address here. Further, our analysis method is ex post, and although we urge you to investigate the IS applications that we focus upon (see Appendix #1), we make absolutely no claims to the correctness of our analysis. Rather, it is our hope that our work here is a decent starting point in unpacking this swiftly emerging domain. Further still, due to the nascent and continuously emerging nature of our subject matter, there is not a rich or deep literature for us to draw upon to ground our claims. Although wherever possible we employ the extant literature faithfully, the lack of extant research signals the need for caution in accepting our results.

Despite these limitations, we believe that our work is a useful starting point in unpacking this domain. We systematically use the Theory of Crowd Capital to bound and limit our investigation, and in doing so draw out some useful distinctions in the crowd-engaging IS domain. We also raise interesting questions for future research, and we look forward to research that investigates the relative efficacy of Episodic versus Continuing structures. Further, we begin to address the nature of the crowd that different forms of IS engage, and we believe that this will prove to be a rich vein of research, investigating the relative merits of curated crowds vs. non-curated crowds, the intersection of captive and public crowds, and the use of multiple crowds in parallel.

Similarly, we feel that our work is a very useful resource for the practitioner community, especially for those organizations that are considering crowd-engagement endeavours. Our work supplies decision-makers in such organizations with a systematic starting point, highlighting some key decision issues and potentialities of the paradigm, on both a strategic and an operational level.
5. References
[1] A. Afuah, and C.L. Tucci, "Crowdsourcing as a solution to distant search", Academy of Management Review, (37:3), 2012, pp. 355-375.
[2] P.J. Ågerfalk, "Outsourcing to an unknown workforce: Exploring open sourcing as a global strategy", MIS Quarterly, (32:2), 2008, pp. 385-409.
[3] J.B. Barney, "Firm resources and sustained competitive advantage", Journal of Management, (17:1), 1991, pp. 99-120.
[4] Y. Benkler, and H. Nissenbaum, "Commons based peer production and virtue", Journal of Political Philosophy, (14), 2006, pp. 394-419.
[5] D.C. Brabham, "Crowdsourcing as a model for problem solving", Convergence, (14:1), 2008, pp. 75-90.
[6] D.C. Brabham, "Moving the crowd at threadless: Motivations for participation in a crowdsourcing application", Information, Communication & Society, (13:8), 2010, pp. 1122.
[7] H.W. Chesbrough, Open Innovation: The new imperative for creating and profiting from technology, Boston: Harvard Business School Press, 2003.
[8] K. Crowston, and N.R. Prestopnik, "Motivation and data quality in a citizen science game: A design science evaluation", Proceedings of Hawai’i International Conference on System Science, 2013.
[9] U. Fayyad, G.P. Shapiro, and P. Smyth, "From Data Mining to Knowledge Discovery in Databases", AI Magazine, Fall 1996, pp. 37-53.
[10] E. Hand, "Citizen science: People power", Nature, (466:7307), 2010, pp. 685-687.
[11] R. Hankins, and A. Lee, "Crowd sourcing and Prediction markets", CHI 2011, May 7-12, 2011, Vancouver, BC, Canada.
[12] F.A. Hayek, "The use of knowledge in society", The American Economic Review, (35:4), 1945, pp. 519-530.
[13] B.A. Huberman, "Crowdsourcing and attention", Computer, (41:11), 2008, pp. 103-105.
[14] A. Majchrzak, "Enabling customer-centricity using wikis and the wiki way", Journal of Management Information Systems, (23:3), 2006, pp. 17-43.
[15] R. Makadok, "Toward a synthesis of the resource based view and dynamic-capability: Views of rent creation", Strategic Management Journal, (22:5), 2001, pp. 387-401.
[16] J. Marlow, and L. Dabbish, "Activity traces and signals in software developer recruitment and hiring", In Proceedings of the 2013 conference on Computer supported cooperative work, 2013, pp. 145-156.
[17] A. Monroy-Hernández, E. Kiciman, M. De Choudhury, and S. Counts, "The new war correspondents: The rise of civic media curation in urban warfare", In Proceedings of the 2013 conference on Computer supported cooperative work, 2013, pp. 1443-1452.
[18] A. Nagpal, S. Hangal, R.R. Joyee, and M.S. Lam, "Friends, romans, countrymen: lend me your URLs. using social chatter to personalize web search", In Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work, 2012, pp. 461-470.
[19] P. Narula, P. Gutheim, D. Rolnitzky, A. Kulkarni, and B. Hartmann, "MobileWorks: A Mobile Crowdsourcing Platform for Workers at the Bottom of the Pyramid", In Proceedings of HCOMP, 2011.
[20] C.K. Prahalad, and V. Ramaswamy, "Co-Creating unique value with customers", Strategy & Leadership, (32:3), 2004, pp. 4-9.
[21] C.D. Prieur, J. Cardon, S. Beuscart, N. Pissard, and P. Pons, "The strength of weak cooperation: A case study on flickr", 2008, pp. 610-613. Retrieved from http://arxiv.org/abs/0802.2317
[22] J. Prpić, and P. Shukla, "The Theory of Crowd Capital", Proceedings of the 46th Annual Hawaii International Conference on System Sciences, Maui, Hawaii, January 7-10, Computer Society Press, 2013.
[23] D. Rigby, and C. Zook, "Open market innovation", Harvard Business Review, (80:10), 2002, pp. 80-89.
[24] J. Silvertown, "A new dawn for citizen science", Trends in Ecology & Evolution, (24:9), 2009, pp. 467-471.
[25] L. Singer, F.F. Filho, B. Cleary, C. Treude, M.A. Storey, and K. Schneider, "Mutual assessment in the social programmer ecosystem: An empirical investigation of developer profile aggregators", In Proceedings of the 2013 conference on Computer supported cooperative work, ACM, 2013, pp. 103-116.
[26] J.C. Spender, "Making knowledge the basis of a dynamic theory of the firm", Strategic Management Journal, (17:2), 1996, pp. 45-62.
[27] J.C. Spender, and R.M. Grant, "Knowledge and the firm: Overview", Strategic Management Journal, (17:2), 1996, pp. 5-9.
[28] E. von Hippel, "Open source software projects as user innovation networks - no manufacturer required", In Perspectives on Free and Open Source Software, edited by J. Feller, B. Fitzgerald, S. Hissam, and K. Lakhani. Cambridge: MIT Press, 2005.
[29] H.J. Watson, and B.H. Wixom, "The Current State of Business Intelligence", Computer, (40:9), 2007, pp. 96-99.
[30] F. Wu, "Crowdsourcing, attention and productivity", Journal of Information Science, (35:6), 2009, pp. 758-765.
[31] Y. Yoo, R.J. Boland, K. Lyytinen, and A. Majchrzak, "Organizing for innovation in the digitized world", Organization Science, (23:5), 2012, pp. 1398-1408.
6. Appendix #1
99Designs: http://99designs.ca/
Best Buy: http://online.wsj.com/article/SB122152452811139909.htm
Crowdflower: http://crowdflower.com/
DataStation: http://www.datastation.com/
Hiretheworld: https://www.hiretheworld.com/
Imaginatik: http://www.imaginatik.com/
Innocentive: http://www.innocentive.com
Kaggle: http://www.kaggle.com/
Lumenogic: http://www.lumenogic.com/www/index.html
MobileWorks: https://www.mobileworks.com/
M-Turk: https://www.mturk.com/mturk/welcome
NapkinLabs: http://napkinlabs.com/
ReCaptcha: http://www.google.com/recaptcha
Wikipedia: http://www.wikipedia.org/