We are happy to welcome Xiaobo Zhao as the newest member of HASTE. Xiaobo is joining the group of Andreas Hellander as a postdoctoral researcher. He will work on research and development of intelligent stream data processing pipelines, and on intelligent and efficient cloud systems capable of mapping data and compute to a variety of cloud computing and data storage e-infrastructure based on the quality and interestingness of the data.
Xiaobo Zhao received the M.S. degree in Communications and Information System from Northwestern Polytechnical University, Xi’an, China in 2015. He later received a Ph.D. degree in Electrical and Computer Engineering from Aarhus University, Aarhus, Denmark in 2020. Before joining the Hellander lab, he was a Research Assistant at Aarhus University and focused on efficient ML/DL service offloading to Edge/Cloud servers.
We are happy to announce that our paper “Deep learning models for lipid-nanoparticle-based drug delivery” is now available ahead of print and open access in the journal Nanomedicine.
Authors: Harrison PJ, Wieslander H, Sabirsh A, Karlsson J, Malmsjö V, Hellander A, Wählby C & Spjuth O.
Abstract: Background: Early prediction of time-lapse microscopy experiments enables intelligent data management and decision-making. Aim: Using time-lapse data of HepG2 cells exposed to lipid nanoparticles loaded with mRNA for expression of GFP, the authors hypothesized that it is possible to predict in advance whether a cell will express GFP. Methods: The first modeling approach used a convolutional neural network extracting per-cell features at early time points. These features were then combined and explored using either a long short-term memory network (approach 2) or time series feature extraction and gradient boosting machines (approach 3). Results: Accounting for the temporal dynamics significantly improved performance. Conclusion: The results highlight the benefit of accounting for temporal dynamics when studying drug delivery using high-content imaging.
In the figure below we show a schematic for the modelling approach used in the paper that combined convolutional and recurrent neural networks (long short-term memory, LSTM). This model is used for predicting information only present in the GFP channel at the end of the experiment from other imaging channels captured during the early time points of the experiment, prior to any GFP expression.
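For readers who want a feel for the recurrent half of such a model, here is a minimal sketch in plain Python. It assumes the convolutional network has already turned each early time point into a small per-cell feature vector, and it passes that sequence through a single LSTM layer followed by a sigmoid readout. This is not the architecture or code from the paper: the function names (`init_weights`, `lstm_step`, `predict_expression`), the layer sizes, and the random untrained weights are all hypothetical, and bias terms are omitted for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_in, n_hidden, seed=0):
    """Random, untrained weights (placeholders for what training would learn)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    # One matrix per gate, applied to the concatenation [features, previous hidden state].
    gates = {name: matrix(n_hidden, n_in + n_hidden) for name in ("i", "f", "o", "g")}
    readout = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
    return gates, readout


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lstm_step(gates, x, h, c):
    """One LSTM update (no biases): returns the new hidden and cell states."""
    z = x + h  # list concatenation of input features and previous hidden state
    i = [sigmoid(a) for a in matvec(gates["i"], z)]    # input gate
    f = [sigmoid(a) for a in matvec(gates["f"], z)]    # forget gate
    o = [sigmoid(a) for a in matvec(gates["o"], z)]    # output gate
    g = [math.tanh(a) for a in matvec(gates["g"], z)]  # candidate cell values
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def predict_expression(feature_sequence, gates, readout):
    """Map a sequence of per-cell feature vectors to a value in (0, 1)."""
    n_hidden = len(readout)
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in feature_sequence:
        h, c = lstm_step(gates, list(x), h, c)
    return sigmoid(sum(w * hk for w, hk in zip(readout, h)))


# Five early time points, each with four made-up feature values for one cell.
gates, readout = init_weights(n_in=4, n_hidden=3)
sequence = [[0.1, 0.2, 0.3, 0.4] for _ in range(5)]
print(predict_expression(sequence, gates, readout))
```

With untrained weights the output carries no meaning. The point is only the data flow: features from each early time point update a running state, and the final state is read out as a single prediction for the cell.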
Following the win at the Adipocyte Imaging Challenge organized by AstraZeneca, two PhD students from the team, Ankit Gupta and Håkan Wieslander, were asked to comment in a technical report in Nature on the topic of virtual staining.
I’ve just presented the paper “Adapting The Secretary Hiring Problem for Optimal Hot-Cold Tier Placement under Top-K Workloads” at DBDM, CCGrid here in Larnaca, Cyprus.
The paper examines analytic solutions to optimization problems in tiered/hierarchical storage under Top-K queries with HASTE, and their relation to the classic discrete optimization problem known as the ‘Secretary Hiring Problem’.
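As background, here is a minimal sketch of the classic Secretary Hiring Problem itself, not the adaptation developed in the paper. Candidates arrive one at a time, each must be accepted or rejected on the spot, and the well-known strategy is to observe roughly the first n/e candidates without accepting any, then accept the first one that beats everything seen so far. The function names `secretary_pick` and `success_rate` are made up for this illustration. The analogy to storage is only loose: one could think of each arriving document as a candidate and its interestingness score as its quality.

```python
import math
import random


def secretary_pick(scores):
    """Classic 1/e stopping rule. Returns the index of the accepted candidate."""
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one candidate")
    cutoff = int(n / math.e)  # length of the observe-only phase
    best_seen = max(scores[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if scores[i] > best_seen:
            return i
    return n - 1  # nobody beat the observation phase: forced to take the last one


def success_rate(n, trials, seed=0):
    """Fraction of random arrival orders in which the rule picks the overall best."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        scores = [rng.random() for _ in range(n)]
        if scores[secretary_pick(scores)] == max(scores):
            hits += 1
    return hits / trials


print(success_rate(n=50, trials=2000))
```

For moderately large n the printed fraction should land close to 1/e (about 0.37), which is the classic result for this rule.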
PhD students Håkan Wieslander, Phil Harrison and Ankit Gupta visited AstraZeneca, hosted by Johan Karlsson and Alan Sabirsh. They had three intense days in the lab getting the high-throughput microscope to talk to the HASTE code. It’s not every day a computer scientist gets to dress up in a lab coat!
Everyone presented their latest work, and discussed the latest image datasets from AstraZeneca and Vironova. During the software workshop session, we discussed linking the HASTE cloud pipeline to the Vironova MiniTEM.
Thanks to: Carolina Wählby, Ola Spjuth, Andreas Hellander, Ida-Maria Sintorn, Alan Sabirsh, Ernst Ahlberg Helgee, Johan Karlsson, Håkan Wieslander, Philip Harrison, Salman Toor, Ben Blamey, Håkan Öhrn, Markus M. Hilscher, Niharika Gauraha, Magnus Larsson, Oliver Stein, Andy Ishak
HASTE has been featured in ‘Framtidens Forskning’: “As more and more instruments are generating more and more data, we need new methods to not completely drown in data volumes. Our tools make it possible to know in advance where to focus the analysis, which greatly reduces time consumption and streamlines resource usage,” said Prof. Carolina Wählby, Principal Investigator for the HASTE project. Read the full article.
The HASTE team are pleased to announce a new publication on the arXiv pre-print service: ‘Apache Spark Streaming and HarmonicIO: A Performance and Architecture Comparison’. We performed a benchmark analysis to compare two stream processing frameworks: the popular Apache Spark framework, widely used in industry, and our own framework, HarmonicIO (presented this summer at IEEE Cloud 2018 in San Francisco).
Previous studies have demonstrated that Apache Spark, Flink and related frameworks can perform stream processing at very high frequencies, but they tend to focus on small messages with a computationally light ‘map’ stage for each message, a common enterprise use case (for example, processing JSON documents). In academic HPC contexts, we often want to analyze larger messages with more CPU-intensive computations. Our study adds to these benchmarks by broadening the domain to include such processing loads: larger messages (leading to network-bound throughput) and computationally intensive map stages (leading to CPU-bound throughput). The aim is to evaluate the applicability of these frameworks to scientific computing applications.
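The difference between network-bound and CPU-bound throughput can be illustrated with a back-of-the-envelope model. The sketch below is not taken from the study or its benchmarking toolset. It assumes an idealized system with no framework overhead, where the achievable message rate is the smaller of what the network link can deliver and what the available cores can process. The function name `max_message_rate` and all the numbers are hypothetical.

```python
def max_message_rate(msg_bytes, cpu_seconds_per_msg, link_bytes_per_s, cores):
    """Idealized upper bound on messages per second, and which resource limits it.

    Ignores serialization, scheduling and framework overhead entirely.
    """
    network_rate = link_bytes_per_s / msg_bytes
    if cpu_seconds_per_msg > 0:
        cpu_rate = cores / cpu_seconds_per_msg
    else:
        cpu_rate = float("inf")  # a free map stage never limits throughput
    if network_rate <= cpu_rate:
        return network_rate, "network"
    return cpu_rate, "cpu"


# Hypothetical 10 Gbit/s link (1.25e9 bytes/s) and 8 worker cores.
LINK = 1_250_000_000
CORES = 8

# Large message, light map stage: the link is the bottleneck.
print(max_message_rate(10_000_000, 0.001, LINK, CORES))

# Small message, heavy map stage: the cores are the bottleneck.
print(max_message_rate(1_000, 0.1, LINK, CORES))
```

Real frameworks add their own per-message costs on top of these bounds, which is why measured benchmarks are needed rather than a simple model like this one.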
We find that relative performance varies considerably across this domain, and that the chosen means of stream source integration has a big impact. Most interestingly, we find that Spark performs very well for large (~10 MB) and small (~1 KB) message sizes, but for medium-sized messages it can be outperformed by HarmonicIO in some configurations. These message sizes are relevant to HASTE, because such file sizes are typical of microscopy applications.
We offer recommendations for choosing and configuring the frameworks, and present a benchmarking toolset developed for this study.
We had a successful project meeting in Uppsala/Stockholm last month. Håkan Wieslander presented his latest research on image feature analysis, Phil Harrison presented his latest conformal prediction models, Ben Blamey demonstrated the prototype HASTE pipeline, and Niharika Gauraha presented her work on SVM+. Alan Sabirsh and Johan Karlsson explained a little more about their work at AstraZeneca.
On day 2, we visited Vironova in Stockholm, where we were treated to a hands-on demo of their MiniTEM electron microscope and discussed plans for the next project phase.