Big Data

https://www.readbyqxmd.com/read/29570416/fraudbuster-reducing-fraud-in-an-auto-insurance-market
#1
Saurabh Nagrecha, Reid A Johnson, Nitesh V Chawla
Nonstandard insurers suffer from a peculiar variant of fraud wherein an overwhelming majority of claims have the semblance of fraud. We show that state-of-the-art fraud detection performs poorly when deployed at underwriting. Our proposed framework "FraudBuster" represents a new paradigm in predicting segments of fraud at underwriting in an interpretable and regulation compliant manner. We show that the most actionable and generalizable profile of fraud is represented by market segments with high confidence of fraud and high loss ratio...
March 2018: Big Data
https://www.readbyqxmd.com/read/29570415/a-literature-survey-and-experimental-evaluation-of-the-state-of-the-art-in-uplift-modeling-a-stepping-stone-toward-the-development-of-prescriptive-analytics
#2
Floris Devriendt, Darie Moldovan, Wouter Verbeke
Prescriptive analytics extends predictive analytics by estimating an outcome as a function of control variables, which makes it possible to establish the level of those variables required to realize a desired outcome. Uplift modeling is at the heart of prescriptive analytics and aims to estimate the net difference in an outcome that results from applying a specific action or treatment. In this article, a structured and detailed literature survey on uplift modeling is provided by identifying and contrasting various groups of approaches...
March 2018: Big Data
https://www.readbyqxmd.com/read/29570414/geospatial-analytics-in-retail-site-selection-and-sales-prediction
#3
Choo-Yee Ting, Chiung Ching Ho, Hui Jia Yee, Wan Razali Matsah
Studies have shown that certain features from geography, demography, trade area, and environment can play a vital role in retail site selection, largely due to the impact they exert on retail performance. Although the relevant features could be elicited by domain experts, determining the optimal feature set can be an intractable and labor-intensive exercise. The challenges center around (1) how to determine features that are important to a particular retail business and (2) how to estimate retail sales performance given a new location. The challenges become apparent when the features vary across time...
March 2018: Big Data
https://www.readbyqxmd.com/read/29570413/special-issue-on-profit-driven-analytics
#4
Bart Baesens, Wouter Verbeke, Cristián Bravo
No abstract text is available yet for this article.
March 2018: Big Data
https://www.readbyqxmd.com/read/29570412/profit-based-model-selection-for-customer-retention-using-individual-customer-lifetime-values
#5
María Óskarsdóttir, Bart Baesens, Jan Vanthienen
The goal of customer retention campaigns, by design, is to add value and enhance the operational efficiency of businesses. For organizations that strive to retain their customers in saturated, and sometimes fast moving, markets such as the telecommunication and banking industries, implementing customer churn prediction models that perform well and in accordance with the business goals is vital. The expected maximum profit (EMP) measure is tailored toward this problem by taking into account the costs and benefits of a retention campaign and estimating its worth for the organization...
March 2018: Big Data
https://www.readbyqxmd.com/read/29235919/fake-news-a-technological-approach-to-proving-the-origins-of-content-using-blockchains
#6
Steve Huckle, Martin White
In this article, we introduce a prototype of an innovative technology for proving the origins of captured digital media. In an era of fake news, when someone shows us a video or picture of some event, how can we trust its authenticity? It seems that the public no longer believe that traditional media is a reliable reference of fact, perhaps due, in part, to the onset of many diverse sources of conflicting information, via social media. Indeed, the issue of "fake" reached a crescendo during the 2016 U...
December 2017: Big Data
https://www.readbyqxmd.com/read/29235918/detecting-bots-on-russian-political-twitter
#7
Denis Stukal, Sergey Sanovich, Richard Bonneau, Joshua A Tucker
Automated and semiautomated Twitter accounts, bots, have recently gained significant public attention due to their potential interference in the political realm. In this study, we develop a methodology for detecting bots on Twitter using an ensemble of classifiers and apply it to study bot activity within political discussions in the Russian Twittersphere. We focus on the interval from February 2014 to December 2015, an especially consequential period in Russian politics. Among accounts actively Tweeting about Russian politics, we find that on the majority of days, the proportion of Tweets produced by bots exceeds 50%...
December 2017: Big Data
https://www.readbyqxmd.com/read/29235917/computational-propaganda-and-political-big-data-moving-toward-a-more-critical-research-agenda
#8
Gillian Bolsover, Philip Howard
No abstract text is available yet for this article.
December 2017: Big Data
https://www.readbyqxmd.com/read/29235916/harvesting-social-signals-to-inform-peace-processes-implementation-and-monitoring
#9
Aastha Nigam, Henry K Dambanemuya, Madhav Joshi, Nitesh V Chawla
Peace processes are complex, protracted, and contentious, involving significant bargaining and compromise among various societal and political stakeholders. In civil war terminations, it is pertinent to measure the pulse of the nation to ensure that the peace process is responsive to citizens' concerns. Social media wields tremendous power as a tool for dialogue, debate, organization, and mobilization, thereby adding more complexity to the peace process. Using Colombia's final peace agreement and national referendum as a case study, we investigate the influence of two important indicators: intergroup polarization and public sentiment toward the peace process...
December 2017: Big Data
https://www.readbyqxmd.com/read/29235915/social-bots-human-like-by-means-of-human-control
#10
Christian Grimme, Mike Preuss, Lena Adam, Heike Trautmann
Social bots are currently regarded as an influential but also somewhat mysterious factor in public discourse and opinion making. They are considered to be capable of massively distributing propaganda in social and online media, and their application is even suspected to be partly responsible for recent election results. Astonishingly, the term social bot is not well defined and different scientific disciplines use divergent definitions. This work starts with an attempt at a balanced definition, before providing an overview of how social bots actually work (taking the example of Twitter) and what their current technical limitations are...
December 2017: Big Data
https://www.readbyqxmd.com/read/29235914/improving-predictive-accuracy-in-elections
#11
David Sathiaraj, William M Cassidy, Eric Rohli
The problem of accurately predicting vote counts in elections is considered in this article. Typically, small-sample polls are used to estimate or predict election outcomes. In this study, a machine-learning hybrid approach is proposed. This approach utilizes multiple sets of static data sources, such as voter registration data, and dynamic data sources, such as polls and donor data, to develop individualized voter scores for each member of the population. These voter scores are used to estimate expected vote counts under different turnout scenarios...
December 2017: Big Data
https://www.readbyqxmd.com/read/29235913/should-we-regulate-digital-platforms
#12
Vasant Dhar
No abstract text is available yet for this article.
December 2017: Big Data
https://www.readbyqxmd.com/read/29182493/japan-s-2014-general-election-political-bots-right-wing-internet-activism-and-prime-minister-shinz%C3%A5-abe-s-hidden-nationalist-agenda
#13
Fabian Schäfer, Stefan Evert, Philipp Heinrich
In this article, we present results on the identification and behavioral analysis of social bots in a sample of 542,584 Tweets, collected before and after Japan's 2014 general election. Typical forms of bot activity include massive Retweeting and repeated posting of (nearly) the same message, sometimes used in combination. We focus on the second method and present (1) a case study on several patterns of bot activity, (2) methodological considerations on the automatic identification of such patterns and the prerequisite near-duplicate detection, and (3) qualitative insights into the purposes behind the usage of social/political bots...
December 2017: Big Data
https://www.readbyqxmd.com/read/28933947/on-the-safety-of-machine-learning-cyber-physical-systems-decision-sciences-and-data-products
#14
Kush R Varshney, Homa Alemzadeh
Machine learning algorithms increasingly influence our decisions and interact with us in all parts of our daily lives. Therefore, just as we consider the safety of power plants, highways, and a variety of other engineered socio-technical systems, we must also take into account the safety of systems involving machine learning. Heretofore, the definition of safety has not been formalized in a machine learning context. In this article, we do so by defining machine learning safety in terms of risk, epistemic uncertainty, and the harm incurred by unwanted outcomes...
September 2017: Big Data
https://www.readbyqxmd.com/read/28933946/detecting-spatial-patterns-of-disease-in-large-collections-of-electronic-medical-records-using-neighbor-based-bootstrapping
#15
Maria T Patterson, Robert L Grossman
We introduce a method called neighbor-based bootstrapping (NB2) that can be used to quantify the geospatial variation of a variable. We applied this method to an analysis of the incidence rates of disease from electronic medical record data (International Classification of Diseases, Ninth Revision codes) for ∼100 million individuals in the United States over a period of 8 years. We considered the incidence rate of disease in each county and its geospatially contiguous neighbors and rank ordered diseases in terms of their degree of geospatial variation as quantified by the NB2 method...
September 2017: Big Data
https://www.readbyqxmd.com/read/28933945/a-message-from-the-editor-in-chief-of-big-data
#16
(no author information available yet)
No abstract text is available yet for this article.
September 2017: Big Data
https://www.readbyqxmd.com/read/28933944/discrn-a-distributed-storytelling-framework-for-intelligence-analysis
#17
Manu Shukla, Raimundo Dos Santos, Feng Chen, Chang-Tien Lu
Storytelling connects entities (people, organizations) using their observed relationships to establish meaningful storylines. This can be extended to spatiotemporal storytelling that incorporates locations, time, and graph computations to enhance coherence and meaning. But when performed sequentially these computations become a bottleneck because the massive number of entities make space and time complexity untenable. This article presents DISCRN, or distributed spatiotemporal ConceptSearch-based storytelling, a distributed framework for performing spatiotemporal storytelling...
September 2017: Big Data
https://www.readbyqxmd.com/read/28933943/what-is-the-role-of-artificial-intelligence-in-sports
#18
Vasant Dhar
No abstract text is available yet for this article.
September 2017: Big Data
https://www.readbyqxmd.com/read/28933942/enhancing-transparency-and-control-when-drawing-data-driven-inferences-about-individuals
#19
Daizhuo Chen, Samuel P Fraiberger, Robert Moakler, Foster Provost
Recent studies show the remarkable power of fine-grained information disclosed by users on social network sites to infer users' personal characteristics via predictive modeling. Similar fine-grained data are being used successfully in other commercial applications. In response, attention is turning increasingly to the transparency that organizations provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. In this article, we focus on inferences about personal characteristics based on information disclosed by users' online actions...
September 2017: Big Data
https://www.readbyqxmd.com/read/28933941/strength-in-numbers-using-big-data-to-simplify-sentiment-classification
#20
Apostolos Filippas, Theodoros Lappas
Sentiment classification, the task of assigning a positive or negative label to a text segment, is a key component of mainstream applications such as reputation monitoring, sentiment summarization, and item recommendation. Even though the performance of sentiment classification methods has steadily improved over time, their ever-increasing complexity renders them comprehensible to only a shrinking minority of expert practitioners. For all others, such highly complex methods are black-box predictors that are hard to tune and even harder to justify to decision makers...
September 2017: Big Data