Read by QxMD icon Read

Big Data

Steve Huckle, Martin White
In this article, we introduce a prototype of an innovative technology for proving the origins of captured digital media. In an era of fake news, when someone shows us a video or picture of some event, how can we trust its authenticity? It seems that the public no longer believe that traditional media is a reliable reference of fact, perhaps due, in part, to the onset of many diverse sources of conflicting information, via social media. Indeed, the issue of "fake" reached a crescendo during the 2016 U...
December 2017: Big Data
Denis Stukal, Sergey Sanovich, Richard Bonneau, Joshua A Tucker
Automated and semiautomated Twitter accounts, bots, have recently gained significant public attention due to their potential interference in the political realm. In this study, we develop a methodology for detecting bots on Twitter using an ensemble of classifiers and apply it to study bot activity within political discussions in the Russian Twittersphere. We focus on the interval from February 2014 to December 2015, an especially consequential period in Russian politics. Among accounts actively Tweeting about Russian politics, we find that on the majority of days, the proportion of Tweets produced by bots exceeds 50%...
December 2017: Big Data
Gillian Bolsover, Philip Howard
No abstract text is available yet for this article.
December 2017: Big Data
Aastha Nigam, Henry K Dambanemuya, Madhav Joshi, Nitesh V Chawla
Peace processes are complex, protracted, and contentious involving significant bargaining and compromising among various societal and political stakeholders. In civil war terminations, it is pertinent to measure the pulse of the nation to ensure that the peace process is responsive to citizens' concerns. Social media yields tremendous power as a tool for dialogue, debate, organization, and mobilization, thereby adding more complexity to the peace process. Using Colombia's final peace agreement and national referendum as a case study, we investigate the influence of two important indicators: intergroup polarization and public sentiment toward the peace process...
December 2017: Big Data
Christian Grimme, Mike Preuss, Lena Adam, Heike Trautmann
Social bots are currently regarded an influential but also somewhat mysterious factor in public discourse and opinion making. They are considered to be capable of massively distributing propaganda in social and online media, and their application is even suspected to be partly responsible for recent election results. Astonishingly, the term social bot is not well defined and different scientific disciplines use divergent definitions. This work starts with a balanced definition attempt, before providing an overview of how social bots actually work (taking the example of Twitter) and what their current technical limitations are...
December 2017: Big Data
David Sathiaraj, William M Cassidy, Eric Rohli
The problem of accurately predicting vote counts in elections is considered in this article. Typically, small-sample polls are used to estimate or predict election outcomes. In this study, a machine-learning hybrid approach is proposed. This approach utilizes multiple sets of static data sources, such as voter registration data, and dynamic data sources, such as polls and donor data, to develop individualized voter scores for each member of the population. These voter scores are used to estimate expected vote counts under different turnout scenarios...
December 2017: Big Data
Vasant Dhar
No abstract text is available yet for this article.
December 2017: Big Data
Fabian Schäfer, Stefan Evert, Philipp Heinrich
In this article, we present results on the identification and behavioral analysis of social bots in a sample of 542,584 Tweets, collected before and after Japan's 2014 general election. Typical forms of bot activity include massive Retweeting and repeated posting of (nearly) the same message, sometimes used in combination. We focus on the second method and present (1) a case study on several patterns of bot activity, (2) methodological considerations on the automatic identification of such patterns and the prerequisite near-duplicate detection, and (3) we give qualitative insights into the purposes behind the usage of social/political bots...
December 2017: Big Data
Kush R Varshney, Homa Alemzadeh
Machine learning algorithms increasingly influence our decisions and interact with us in all parts of our daily lives. Therefore, just as we consider the safety of power plants, highways, and a variety of other engineered socio-technical systems, we must also take into account the safety of systems involving machine learning. Heretofore, the definition of safety has not been formalized in a machine learning context. In this article, we do so by defining machine learning safety in terms of risk, epistemic uncertainty, and the harm incurred by unwanted outcomes...
September 2017: Big Data
Maria T Patterson, Robert L Grossman
We introduce a method called neighbor-based bootstrapping (NB2) that can be used to quantify the geospatial variation of a variable. We applied this method to an analysis of the incidence rates of disease from electronic medical record data (International Classification of Diseases, Ninth Revision codes) for ∼100 million individuals in the United States over a period of 8 years. We considered the incidence rate of disease in each county and its geospatially contiguous neighbors and rank ordered diseases in terms of their degree of geospatial variation as quantified by the NB2 method...
September 2017: Big Data
(no author information available yet)
No abstract text is available yet for this article.
September 2017: Big Data
Manu Shukla, Raimundo Dos Santos, Feng Chen, Chang-Tien Lu
Storytelling connects entities (people, organizations) using their observed relationships to establish meaningful storylines. This can be extended to spatiotemporal storytelling that incorporates locations, time, and graph computations to enhance coherence and meaning. But when performed sequentially these computations become a bottleneck because the massive number of entities make space and time complexity untenable. This article presents DISCRN, or distributed spatiotemporal ConceptSearch-based storytelling, a distributed framework for performing spatiotemporal storytelling...
September 2017: Big Data
Vasant Dhar
No abstract text is available yet for this article.
September 2017: Big Data
Daizhuo Chen, Samuel P Fraiberger, Robert Moakler, Foster Provost
Recent studies show the remarkable power of fine-grained information disclosed by users on social network sites to infer users' personal characteristics via predictive modeling. Similar fine-grained data are being used successfully in other commercial applications. In response, attention is turning increasingly to the transparency that organizations provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. In this article, we focus on inferences about personal characteristics based on information disclosed by users' online actions...
September 2017: Big Data
Apostolos Filippas, Theodoros Lappas
Sentiment classification, the task of assigning a positive or negative label to a text segment, is a key component of mainstream applications such as reputation monitoring, sentiment summarization, and item recommendation. Even though the performance of sentiment classification methods has steadily improved over time, their ever-increasing complexity renders them comprehensible by only a shrinking minority of expert practitioners. For all others, such highly complex methods are black-box predictors that are hard to tune and even harder to justify to decision makers...
September 2017: Big Data
Ravi Shroff
Many municipal agencies maintain detailed and comprehensive electronic records of their interactions with citizens. These data, in combination with machine learning and statistical techniques, offer the promise of better decision making, and more efficient and equitable service delivery. However, a data scientist employed by an agency to implement these techniques faces numerous and varied choices that cumulatively can have significant real-world consequences. The data scientist, who may be the only person at an agency equipped to understand the technical complexity of a predictive algorithm, therefore, bears a good deal of responsibility in making judgments...
September 2017: Big Data
Lewis Alexander, Sanjiv R Das, Zachary Ives, H V Jagadish, Claire Monteleoni
Significant research challenges must be addressed in the cleaning, transformation, integration, modeling, and analytics of Big Data sources for finance. This article surveys the progress made so far in this direction and obstacles yet to be overcome. These are issues that are of interest to data-driven financial institutions in both corporate finance and consumer finance. These challenges are also of interest to the legal profession as well as to regulators. The discussion is relevant to technology firms that support the growing field of FinTech...
September 2017: Big Data
Gina Neff, Anissa Tanweer, Brittany Fiore-Gartland, Laura Osburn
What would data science look like if its key critics were engaged to help improve it, and how might critiques of data science improve with an approach that considers the day-to-day practices of data science? This article argues for scholars to bridge the conversations that seek to critique data science and those that seek to advance data science practice to identify and create the social and organizational arrangements necessary for a more ethical data science. We summarize four critiques that are commonly made in critical data studies: data are inherently interpretive, data are inextricable from context, data are mediated through the sociomaterial arrangements that produce them, and data serve as a medium for the negotiation and communication of values...
June 2017: Big Data
Elana Zeide
Educators and commenters who evaluate big data-driven learning environments focus on specific questions: whether automated education platforms improve learning outcomes, invade student privacy, and promote equality. This article puts aside separate unresolved-and perhaps unresolvable-issues regarding the concrete effects of specific technologies. It instead examines how big data-driven tools alter the structure of schools' pedagogical decision-making, and, in doing so, change fundamental aspects of America's education enterprise...
June 2017: Big Data
Marina Drosou, H V Jagadish, Evaggelia Pitoura, Julia Stoyanovich
Big data technology offers unprecedented opportunities to society as a whole and also to its individual members. At the same time, this technology poses significant risks to those it overlooks. In this article, we give an overview of recent technical work on diversity, particularly in selection tasks, discuss connections between diversity and fairness, and identify promising directions for future work that will position diversity as an important component of a data-responsible society. We argue that diversity should come to the forefront of our discourse, for reasons that are both ethical-to mitigate the risks of exclusion-and utilitarian, to enable more powerful, accurate, and engaging data analysis and use...
June 2017: Big Data
Fetch more papers »
Fetching more papers... Fetching...
Read by QxMD. Sign in or create an account to discover new knowledge that matter to you.
Remove bar
Read by QxMD icon Read

Search Tips

Use Boolean operators: AND/OR

diabetic AND foot
diabetes OR diabetic

Exclude a word using the 'minus' sign

Virchow -triad

Use Parentheses

water AND (cup OR glass)

Add an asterisk (*) at end of a word to include word stems

Neuro* will search for Neurology, Neuroscientist, Neurological, and so on

Use quotes to search for an exact phrase

"primary prevention of cancer"
(heart or cardiac or cardio*) AND arrest -"American Heart Association"