Research Interest
My current research focuses on statistical quality control for crowdsourcing. I am also broadly interested in:
- Combining human and computing power, i.e., human computation and crowdsourcing [KDD 2013][IAAI 2013][IJCAI 2013]
- Extracting knowledge from users' implicit behaviors on the Web, e.g., social tagging [ECAI 2010] and spelling errors [ACL 2012]
Keywords:
Data Mining, Crowdsourcing, Human Computation, Web Mining
Professional Experience
- Associate Professor, April 2018 to present
- Faculty of Engineering, Information and Systems, University of Tsukuba
- Assistant Professor, September 2015 to March 2018
- Machine Learning and Data Mining Research Laboratory, Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University
- Program-Specific Assistant Professor, April 2015 to September 2015
- Machine Learning and Data Mining Research Laboratory, Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University
- Project Research Associate, April 2014 to March 2015
- Global Research Center for Big Data Mathematics, National Institute of Informatics
- Project: JST, ERATO, Kawarabayashi Large Graph Project
- Project Researcher, June 2012 to March 2014
- Information-Theoretic Machine Learning and Data Mining Group, Dept. of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo
- Project: Development of the Fastest Database Engine for the Era of Very Large Database and Experiment and Evaluation of Strategic Social Services Enabled by the Database Engine, FIRST Program
- Project Researcher, April 2012 to May 2012
- National Institute of Informatics
- Research Intern, June 2011 to September 2011
- Microsoft Research, Redmond, USA
Mentor: Dr. Hisami Suzuki (Natural Language Processing Group)
- Research Intern, September 2010 to February 2011
- Microsoft Research Asia, Beijing, China
Mentors: Dr. Xian-Sheng Hua (Media Computing Group) and Dr. Lei Zhang (Web Search and Mining Group)
- Research Intern, August 2009 to October 2009
- Fujitsu Laboratories of America, Sunnyvale, USA
Mentor: Dr. Alex Gilman
- Research Assistant, April 2007 to August 2010, April 2011 to June 2011, October 2011 to March 2012
- National Institute of Informatics
Education
- Ph.D. in Information Science and Technology, June 2012
- School of Information Science and Technology, The University of Tokyo
Thesis title: Acquiring Word Denotations as Real-World Data from Social Tagging [dissertation (in Japanese)]
Supervisor: Prof. Shinichi Honiden
- Ph.D. candidate, April 2009 to March 2012
- School of Information Science and Technology, The University of Tokyo
Supervisor: Prof. Shinichi Honiden
- Master of Information Science and Technology, March 2009
- School of Information Science and Technology, The University of Tokyo
Thesis title: Extracting Spatial Concepts Labeled by Tags in Folksonomy
Supervisor: Prof. Shinichi Honiden
- Bachelor of Engineering, March 2007
- Electrical Engineering, Tokyo University of Science
Thesis title: Secret Sharing Scheme Suitable for Memory and Database
Supervisor: Prof. Keiichi Iwamura
Invited Talks and Tutorials
-
- Statistical Quality Control for Human Computation and Crowdsourcing
- Yukino Baba
- IJCAI 2018 Early Career Spotlight Talk
-
- Crowdsourcing for Big Data Analytics
- Hisashi Kashima, Satoshi Oyama, Yukino Baba
- PAKDD 2015 Tutorial
Academic Services
Conference Organizer
- Publicity/Social Co-chair, 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017)
- Program Committee, NAACL 2013 Student Research Workshop
Misc.
- Organizer, MLSS 2015 Predictive Modeling Challenge
Skills
- Languages: Japanese (Native), English (Advanced: TOEIC 915)
- Programming Languages: Python, Ruby, PHP, C, C++, Java, SQL
- Platforms: Mac OS X, Linux (Fedora)
- Web Design: HTML, CSS, JavaScript
Datasets
- Spelling-Correction Data
A collection of strings, including before-and-after spelling-correction pairs in English and Japanese, derived automatically from keystroke logs collected through Amazon’s Mechanical Turk. See our paper for details on how the data was generated.
Publications
Refereed journal articles
-
- Synthetic accessibility assessment using auxiliary responses
- Shun Ito, Yukino Baba, Hisashi Kashima
- Expert Systems with Applications (ESWA), Vol.145, 2020
-
- Wisdom of Crowds for Synthetic Accessibility Evaluation
- Yukino Baba, Tetsu Isomura, Hisashi Kashima
- Journal of Molecular Graphics and Modelling, Vol.80, pp.217–223, 2018
- [dataset]
-
- Supervised and Unsupervised Intrusion Detection Based on CAN Message Frequencies for In-Vehicle Network
- Takuya Kuwahara, Yukino Baba, Hisashi Kashima, Takeshi Kishikawa, Junichi Tsurumi, Tomoyuki Haga, Yoshihiro Ujiie, Takamitsu Sasaki, Hideki Matsushima
- Journal of Information Processing, Vol.26, pp.306–313, 2018
-
- Crowdsourcing Chart Digitizer: Task Design and Quality Control for Making Legacy Open Data Machine-Readable
- Satoshi Oyama, Yukino Baba, Ikki Ohmukai, Hiroaki Dokoshi, Hisashi Kashima
- International Journal of Data Science and Analytics, Vol.2, No.1, pp.45–60, 2016
-
- Participation Recommendation System for Crowdsourcing Contests
- Yukino Baba, Kei Kinoshita, Hisashi Kashima
- Expert Systems with Applications (ESWA), Vol.58, pp.174–183, 2016
-
- Quality Control of Crowdsourced Classification Using Hierarchical Class Structures
- Naoki Otani, Yukino Baba, Hisashi Kashima
- Expert Systems with Applications (ESWA), Vol.58, pp.155–163, 2016
-
- A Health Checkup and Tele-Medical Intervention Program for Preventive Medicine in Developing Countries: A Verification Study
- Yasunobu Nohara, Eiko Kai, Partha Ghosh, Rafiqul Islam, Ashir Ahmed, Masahiro Kuroda, Sozo Inoue, Tatsuo Hiramatsu, Michio Kimura, Shuji Shimizu, Kunihisa Kobayashi, Yukino Baba, Hisashi Kashima, Koji Tsuda, Masashi Sugiyama, Mathieu Blondel, Naonori Ueda, Masaru Kitsuregawa, Naoki Nakashima
- Journal of Medical Internet Research (JMIR), Vol.17, No.1, p.e2, 2015
-
- Leveraging Non-Expert Crowdsourcing Workers for Improper Task Detection in Crowdsourcing Marketplaces
- Yukino Baba, Hisashi Kashima, Kei Kinoshita, Goushi Yamaguchi, Yosuke Akiyoshi
- Expert Systems with Applications (ESWA), Vol.41, No.6, pp.2678–2687, 2014
Refereed conference and workshop papers
-
- HumanGAN: Generative Adversarial Network with Human-based Discriminator and its Evaluation in Speech Perception Modeling
- Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, Hiroshi Saruwatari
- 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
-
- Active Learning Strategies for Hierarchical Labeling Microtasks
- Kousuke Uo, Masaki Kobayashi, Masaki Matsubara, Yukino Baba, Atsuyuki Morishima
- 3rd IEEE Workshop on Human-in-the-loop Methods and Human Machine Collaboration in BigData, 2019
-
- Interdependence Model for Multi-label Classification
- Kosuke Yoshimura, Tomoaki Iwase, Yukino Baba, Hisashi Kashima
- 28th International Conference on Artificial Neural Networks (ICANN), 2019
-
- Large-scale Driver Identification Using Automobile Driving Data
- Daiki Tanaka, Yukino Baba, Hisashi Kashima, Yuta Okubo
- 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), 2019
-
- Probabilistic Modeling of Peer Correction and Peer Assessment
- Takeru Sunahase, Yukino Baba, Hisashi Kashima
- 12th International Conference on Educational Data Mining (EDM), 2019
-
- CrowNN: Human-in-the-loop Network with Crowd-generated Inputs
- Yusuke Sakata, Yukino Baba, Hisashi Kashima
- 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
- [poster]
-
- BayesGrad: Explaining Predictions of Graph Convolutional Networks
- Hirotaka Akita, Kosuke Nakago, Tomoki Komatsu, Yohei Sugawara, Shin-ichi Maeda, Yukino Baba, Hisashi Kashima
- 25th International Conference on Neural Information Processing (ICONIP), 2018
- [arXiv]
-
- Incorporating Worker Similarity for Label Aggregation in Crowdsourcing
- Jiyi Li, Yukino Baba, Hisashi Kashima
- 27th International Conference on Artificial Neural Networks (ICANN), 2018
-
- Statistical Quality Control for Human Computation and Crowdsourcing
- Yukino Baba
- 27th International Joint Conference on Artificial Intelligence (IJCAI), 2018
- [slide][speakerdeck]
-
- Payload-based Statistical Intrusion Detection for In-vehicle Networks
- Takuya Kuwahara, Yukino Baba, Hisashi Kashima, Takeshi Kishikawa, Junichi Tsurumi, Tomoyuki Haga, Yoshihiro Ujiie, Takamitsu Sasaki, Hideki Matsushima
- Australasian Workshop on Machine Learning for Cyber-security (co-located with PAKDD 2018), 2018
-
- Simultaneous Clustering and Ranking from Pairwise Comparisons
- Jiyi Li, Yukino Baba, Hisashi Kashima
- 27th International Joint Conference on Artificial Intelligence (IJCAI), 2018
-
- Predictive Modeling of Learning Continuation in Preschool Education Using Temporal Patterns of Development Tests
- Junpei Naito, Yukino Baba, Hisashi Kashima, Takenori Takaki, Takuya Funo
- 8th Symposium on Educational Advances in Artificial Intelligence (EAAI), 2018
-
- Data Analysis Competition Platform for Educational Purposes: Lessons Learned and Future Challenges
- Yukino Baba, Tomoumi Takase, Kyohei Atarashi, Satoshi Oyama, Hisashi Kashima
- 8th Symposium on Educational Advances in Artificial Intelligence (EAAI), 2018
- [slide]
-
- AdaFlock: Adaptive Feature Discovery for Human-in-the-loop Predictive Modeling
- Ryusuke Takahama, Yukino Baba, Nobuyuki Shimizu, Sumio Fujita, Hisashi Kashima
- 32nd AAAI Conference on Artificial Intelligence (AAAI), 2018
-
- Hyper Questions: Unsupervised Targeting of a Few Experts in Crowdsourcing
- Jiyi Li, Yukino Baba, Hisashi Kashima
- 26th ACM International Conference on Information and Knowledge Management (CIKM), 2017
- [dataset]
-
- Atomic Distance Kernel for Material Property Prediction
- Hirotaka Akita, Yukino Baba, Hisashi Kashima, Atsuto Seko
- 24th International Conference on Neural Information Processing (ICONIP), 2017
-
- Quality Control for Crowdsourced Multi-Label Classification using RAkEL
- Kosuke Yoshimura, Yukino Baba, Hisashi Kashima
- 24th International Conference on Neural Information Processing (ICONIP), 2017
-
- Distributed Multi-task Learning for Sensor Network
- Jiyi Li, Tomohiro Arai, Yukino Baba, Hisashi Kashima, Shotaro Miwa
- European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2017
-
- A Generalized Model for Multidimensional Intransitivity
- Jiuding Duan, Jiyi Li, Yukino Baba, Hisashi Kashima
- 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2017
-
- Pairwise HITS: Quality Estimation from Pairwise Comparisons in Creator-Evaluator Crowdsourcing Process
- Takeru Sunahase, Yukino Baba, Hisashi Kashima
- 31st AAAI Conference on Artificial Intelligence (AAAI), 2017
- [code&dataset]
-
- Predicting Fuel Consumption and Flight Delays for Low-cost Airlines
- Yuji Horiguchi, Yukino Baba, Hisashi Kashima, Masahito Suzuki, Hiroki Kayahara, Jun Maeno
- 29th Conference on Innovative Applications of Artificial Intelligence (IAAI), 2017
-
- Learning to Enumerate
- Patrick Jörger, Yukino Baba, Hisashi Kashima
- 25th International Conference on Artificial Neural Networks (ICANN), 2016
-
- Assessing Translation Ability through Vocabulary Ability Assessment
- Yo Ehara, Yukino Baba, Masao Utiyama, Eiichiro Sumita
- 25th International Joint Conference on Artificial Intelligence (IJCAI), 2016
-
- Quality Control for Crowdsourced Hierarchical Classification
- Naoki Otani, Yukino Baba, Hisashi Kashima
- 2015 IEEE International Conference on Data Mining (ICDM), 2015
- [code&datasets]
-
- From One Star to Three Stars: Upgrading Legacy Open Data Using Crowdsourcing
- Satoshi Oyama, Yukino Baba, Ikki Ohmukai, Hiroaki Dokoshi, Hisashi Kashima
- 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2015
-
- Predictive Approaches for Low-cost Preventive Medicine Program in Developing Countries
- Yukino Baba, Hisashi Kashima, Yasunobu Nohara, Eiko Kai, Partha Ghosh, Rafiqul Islam, Ashir Ahmed, Masahiro Kuroda, Sozo Inoue, Tatsuo Hiramatsu, Michio Kimura, Shuji Shimizu, Kunihisa Kobayashi, Koji Tsuda, Masashi Sugiyama, Mathieu Blondel, Naonori Ueda, Masaru Kitsuregawa, Naoki Nakashima
- 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2015
- [slide][poster]
-
- Quality Control for Crowdsourced POI Collection
- Shunsuke Kajimura, Yukino Baba, Hiroshi Kajino, Hisashi Kashima
- 19th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2015
-
- Crowdsourced Data Analytics: A Case Study of Predictive Modeling Competition
- Yukino Baba, Nozomi Nori, Shigeru Saito, Hisashi Kashima
- 2014 International Conference on Data Science and Advanced Analytics (DSAA), 2014
- [slide]
-
- Instance-privacy Preserving Crowdsourcing
- Hiroshi Kajino, Yukino Baba, Hisashi Kashima
- 2nd AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2014
-
- Crowdordering
- Toshiko Matsui, Yukino Baba, Toshihiro Kamishima, Hisashi Kashima
- 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2014
-
- Skill Ontology-based Model for Quality Assurance in Crowdsourcing
- Kinda El Maarry, Wolf-Tilo Balke, Hyunsouk Cho, Seung-Won Hwang, Yukino Baba
- DASFAA Workshop on Uncertain and Crowdsourced Data (UnCrowd), 2014
-
- Statistical Quality Estimation for General Crowdsourcing Tasks
- Yukino Baba, Hisashi Kashima
- 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2013
- [slide][poster][dataset]
-
- Accurate Integration of Crowdsourced Labels Using Workers' Self-reported Confidence Scores
- Satoshi Oyama, Yukino Baba, Yuko Sakurai, Hisashi Kashima
- 23rd International Joint Conference on Artificial Intelligence (IJCAI), 2013
-
- Leveraging Crowdsourcing to Detect Improper Tasks in Crowdsourcing Marketplaces
- Yukino Baba, Hisashi Kashima, Kei Kinoshita, Goushi Yamaguchi, Yosuke Akiyoshi
- 25th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI), 2013
- [slide]
-
- How Are Spelling Errors Generated and Corrected? A Study of Corrected and Uncorrected Spelling Errors Using Keystroke Logs
- Yukino Baba, Hisami Suzuki
- 50th Annual Meeting of the Association for Computational Linguistics (ACL), 2012 (short paper)
- [poster][dataset]
-
- Extraction of Places Related to Flickr Tags
- Yukino Baba, Fuyuki Ishikawa, Shinichi Honiden
- 19th European Conference on Artificial Intelligence (ECAI), 2010
-
- Extracting Time and Location Concepts Related to Tags
- Yukino Baba, Fuyuki Ishikawa, Shinichi Honiden
- ISWC Workshop on Incentives for the Semantic Web (INSEMTIVE), 2008
Misc.
-
- Synthetic Accessibility Assessment Using Auxiliary Responses
- Shun Ito, Yukino Baba, Tetsu Isomura, Hisashi Kashima
- 6th AAAI Conference on Human Computation & Crowdsourcing (HCOMP), Works-In-Progress, 2018
-
- Crowdsourcing Data Understanding: A Case Study using Open Government Data
- Yukino Baba, Hisashi Kashima
- 3rd AAAI Conference on Human Computation & Crowdsourcing (HCOMP), Works-In-Progress, 2015
- [poster]
-
- Making Legacy Open Data Machine Readable by Crowdsourcing
- Satoshi Oyama, Yukino Baba, Ikki Ohmukai, Hiroaki Dokoshi, Hisashi Kashima
- 3rd AAAI Conference on Human Computation & Crowdsourcing (HCOMP), Works-In-Progress, 2015
-
- Performance Evaluation between Crowdworkers and Biocurators towards Constructing a CrowdR&D Platform
- Eli Kaminuma, Yukino Baba, Takatomo Fujisawa, Asao Fujiyama, Hisashi Kashima, Yasukazu Nakamura
- 25th International Conference on Genome Informatics (GIW/ISCB-Asia), Poster Track, 2014
-
- Quality Control for Crowdsourced Enumeration Tasks
- Shunsuke Kajimura, Yukino Baba, Hiroshi Kajino, Hisashi Kashima
- 2nd AAAI Conference on Human Computation & Crowdsourcing (HCOMP), Works-In-Progress, 2014
-
- Crowdsourced Data Analytics: A Case Study of a Predictive Modeling Competition
- Yukino Baba, Nozomi Nori, Shigeru Saito, Hisashi Kashima
- 2nd AAAI Conference on Human Computation & Crowdsourcing (HCOMP), Works-In-Progress, 2014
- [poster]
-
- Statistical Quality Estimation for General Crowdsourcing Tasks
- Yukino Baba, Hisashi Kashima
- 1st AAAI Conference on Human Computation & Crowdsourcing (HCOMP), Works-In-Progress, 2013
- [poster]
-
- Crowdsourcing Quality Control for Item Ordering Tasks
- Toshiko Matsui, Yukino Baba, Toshihiro Kamishima, Hisashi Kashima
- 1st AAAI Conference on Human Computation & Crowdsourcing (HCOMP), Works-In-Progress, 2013
-
- EM-Based Inference of True Labels Using Confidence Judgments
- Satoshi Oyama, Yukino Baba, Yuko Sakurai, Hisashi Kashima
- 1st AAAI Conference on Human Computation & Crowdsourcing (HCOMP), Works-In-Progress, 2013
-
- Automatically Mapping Flickr Images to WordNet
- Yukino Baba, Shinichi Honiden
- 5th joint NII-LIP6 WorkShop on Multi-Agent and Distributed Systems, 2010
-
- Extracting Locations Related to Tags on Folksonomy
- Yukino Baba, Fuyuki Ishikawa, Shinichi Honiden
- 4th joint NII-LIP6 WorkShop on Multi-Agent and Distributed Systems, 2009
-
- Extracting and Utilizing Event-Context Relationships in Blogsphere
- Yukino Baba, Fuyuki Ishikawa, Shinichi Honiden
- 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference (ISWC+ASWC), 2007 (Poster/Demo Track)
- [poster]