Hongru Zhai joins the HASTE team to work on developing hierarchical representations of microscopy image data

 

Hongru is a master’s student at the Department of Statistics. His main interests include multivariate statistical methods and Bayesian statistics.

Hongru’s MSc thesis will use statistical methods to develop better hierarchical representations of microscopy image data from cellular experiments, with the aim of improving the readability and informational efficiency of the representation.

Tianru Zhang joins the HASTE team as a new PhD student to work on management of large data streams

We welcome Tianru Zhang as a new PhD student in the Hellander lab at the Department of Information Technology, Uppsala University.

Tianru obtained his Bachelor’s degree in Probability and Statistics in Mathematics at the University of Science and Technology of China in 2017. He then completed his Master’s in Statistics for Smart Data at ENSAI (the National School of Statistics and Analysis of Information, France) in 2018. Before moving to Uppsala, he was employed as an Assistant Researcher at Fujitsu R&D Center Co., Ltd., where he worked on developing DeepTensor (a deep learning method using tensor decomposition) and analyzing data on personal online loans.

Presentation at COPA 2019

Ola Spjuth, Co-PI in the HASTE project, presented two accepted HASTE papers at the [8th Symposium on Conformal and Probabilistic Prediction with Applications](http://clrc.rhul.ac.uk/copa2019) in Varna, Bulgaria, on 9-11 September 2019. Both papers, listed below, are now published in [Proceedings of Machine Learning Research (PMLR) volume 105](https://proceedings.mlr.press/v105/).

Paper 1: Split Knowledge Transfer in Learning Under Privileged Information Framework

Gauraha, N., Söderdahl, F. and Spjuth, O.
Split Knowledge Transfer in Learning Under Privileged Information Framework. 
Proceedings of Machine Learning Research (PMLR). 105, 43-52. (2019).
ABSTRACT
Learning Under Privileged Information (LUPI) enables the inclusion of additional (privileged) information when training machine learning models, data that is not available when making predictions. The methodology has been successfully applied to a diverse set of problems from various fields. SVM+ was the first realization of the LUPI paradigm which showed fast convergence but did not scale well. To address the scalability issue, knowledge transfer approaches were proposed to estimate privileged information from standard features in order to construct improved decision rules. Most available knowledge transfer methods use regression techniques and the same data for approximating the privileged features as for learning the transfer function. Inspired by the cross-validation approach, we propose to partition the training data into K folds and use each fold for learning a transfer function and the remaining folds for approximations of privileged features—we refer to this as split knowledge transfer. We evaluate the method using four different experimental setups comprising one synthetic and three real datasets. The results indicate that our approach leads to improved accuracy as compared to LUPI with standard knowledge transfer.
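The fold-wise idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a single standard feature and a single privileged feature, a plain least-squares line as the transfer function, interleaved folds, and averaging of the K-1 approximations each sample receives. The names `fit_line` and `split_knowledge_transfer` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b on 1-D data."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def split_knowledge_transfer(x, x_priv, k=3):
    """Approximate the privileged feature of every training sample without
    predicting a sample using a transfer function that was fitted on it."""
    n = len(x)
    folds = [list(range(i, n, k)) for i in range(k)]  # simple interleaved folds
    sums = [0.0] * n
    counts = [0] * n
    for fold in folds:
        # Learn the transfer function x -> x_priv on this fold only.
        a, b = fit_line([x[i] for i in fold], [x_priv[i] for i in fold])
        in_fold = set(fold)
        # Apply it to the samples in the remaining folds.
        for i in range(n):
            if i not in in_fold:
                sums[i] += a * x[i] + b
                counts[i] += 1
    # Assumption: average the K-1 approximations each sample receives.
    return [s / c for s, c in zip(sums, counts)]
```

The sketch needs at least two folds, and each fold must contain at least two distinct values of the standard feature so that the line fit is defined.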

Paper 2: Combining Prediction Intervals on Multi-Source Non-Disclosed Regression Datasets

Spjuth, O., Brännström, R.C., Carlsson, L. and Gauraha, N.
Combining Prediction Intervals on Multi-Source Non-Disclosed Regression Datasets.
Proceedings of Machine Learning Research (PMLR). 105, 53-65. (2019).
ABSTRACT
Conformal Prediction is a framework that produces prediction intervals based on the output from a machine learning algorithm. In this paper, we explore the case when training data is made up of multiple parts available in different sources that cannot be pooled. We here consider the regression case and propose a method where a conformal predictor is trained on each data source independently, and where the prediction intervals are then combined into a single interval. We call the approach Non-Disclosed Conformal Prediction (NDCP), and we evaluate it on a regression dataset from the UCI machine learning repository using support vector regression as the underlying machine learning algorithm, with a varying number of data sources and sizes. The results show that the proposed method produces conservatively valid prediction intervals, and while we cannot retain the same efficiency as when all data is used, efficiency is improved through the proposed approach as compared to predicting using a single arbitrarily chosen source.
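As a rough illustration of the setup described in the abstract, the sketch below builds one split conformal interval per data source and then merges the intervals. Two things here are assumptions rather than the paper's method: the underlying model is simply the training mean (the paper uses support vector regression), and the intervals are merged by taking the median of the lower bounds and the median of the upper bounds, since the abstract does not state the combination rule. The function names are hypothetical.

```python
import math
import statistics


def conformal_interval(train_y, calib_y, epsilon=0.2):
    """Split conformal interval from one source, using the training mean as a
    stand-in model (assumption; the paper uses support vector regression)."""
    pred = statistics.fmean(train_y)
    scores = sorted(abs(y - pred) for y in calib_y)  # nonconformity scores
    # Conformal quantile rank (1-based): ceil((n + 1) * (1 - epsilon)).
    rank = math.ceil((len(scores) + 1) * (1 - epsilon))
    if rank > len(scores):
        return (-math.inf, math.inf)  # too few calibration points
    half_width = scores[rank - 1]
    return (pred - half_width, pred + half_width)


def combine_intervals(intervals):
    """Assumed rule: median of lower bounds and median of upper bounds."""
    lows, highs = zip(*intervals)
    return (statistics.median(lows), statistics.median(highs))
```

Each source only shares its interval, never its data, which is the non-disclosure constraint the abstract refers to.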

Presentation at DBDM/CCGrid

I’ve just presented the paper “Adapting The Secretary Hiring Problem for Optimal Hot-Cold Tier Placement under Top-K Workloads” at DBDM, CCGrid here in Larnaca, Cyprus.

 

The paper examines analytic solutions to optimization problems related to tiered/hierarchical storage under Top-K queries with HASTE, and their relation to the classic discrete optimization problem known as the ‘Secretary Hiring Problem’.
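For readers unfamiliar with the classic problem, here is a minimal sketch of the textbook 1/e stopping rule. It is the standard formulation only, not the adaptation to hot-cold tier placement developed in the paper, and the function names are hypothetical.

```python
import math
import random


def secretary_choice(scores):
    """Classic 1/e rule: only observe the first n/e candidates, then accept
    the first later candidate that beats all of them (or the last one)."""
    n = len(scores)  # assumes a non-empty list
    cutoff = int(n / math.e)  # size of the observation-only phase
    best_seen = max(scores[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if scores[i] > best_seen:
            return i
    return n - 1  # nobody beat the benchmark, so take the last candidate


def success_rate(n=50, trials=5000, seed=0):
    """Estimate how often the rule picks the single best candidate."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        scores = [rng.random() for _ in range(n)]
        chosen = secretary_choice(scores)
        hits += scores[chosen] == max(scores)
    return hits / trials
```

For moderately large n, the estimated success rate should land near 1/e, about 0.37, which is the well-known optimum for the classic problem.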

Pre-Preprint

PhD students’ visit to AstraZeneca, Gothenburg (April 15-17)

PhD students Håkan Wieslander, Phil Harrison and Ankit Gupta visited AstraZeneca, hosted by Johan Karlsson and Alan Sabirsh. They had three intense days in the lab getting the high-throughput microscope to talk to the HASTE code. It’s not every day a computer scientist gets to dress up in a lab coat!
Johan explains the workings of the microscope
It’s not easy to debug
What’s an intense coding session without some finger-pointing

Ankit Gupta joins the HASTE team as a PhD student

We welcome Ankit Gupta as a new PhD student in the Wählby Lab at the Department of Information Technology, Uppsala University.

Ankit obtained his Bachelor’s in Electrical Engineering at the Indian Institute of Technology Indore in 2014. He then completed his Master’s in Medical Imaging and Informatics at the Indian Institute of Technology Kharagpur in 2017. Before moving to Uppsala, he was employed as a Research Engineer at the University of Bern, where he worked on developing a video-based instrument tracking system for stereoscopic laparoscopic surgery.

About the PhD project within HASTE:  

Within the project, he will work on developing measurements for the early detection of informative data from large-scale spatial and temporal experiments.

Successful HASTE ‘all hands’ at Uppsala (Nov 7-9)

Johan makes a start on the fika…
Everyone presented their latest work and discussed the latest image datasets from AstraZeneca and Vironova. During the software workshop session, we discussed linking the HASTE cloud pipeline to the Vironova MiniTEM.

Thanks to: Carolina Wählby, Ola Spjuth, Andreas Hellander, Ida-Maria Sintorn, Alan Sabirsh, Ernst Ahlberg Helgee, Johan Karlsson, Håkan Wieslander, Philip Harrison, Salman Toor, Ben Blamey, Håkan Öhrn, Markus M. Hilscher, Niharika Gauraha, Magnus Larsson, Oliver Stein, Andy Ishak

HASTE project featured in ‘Framtidens Forskning’

HASTE has been featured in ‘Framtidens Forskning’: “As more and more instruments are generating more and more data, we need new methods to not completely drown in data volumes. Our tools make it possible to know in advance where to focus the analysis, which greatly reduces time consumption and streamlines resource usage,” said Prof. Carolina Wählby, Principal Investigator for the HASTE project. Read the full article.