See Google Scholar for an up-to-date publication list.
GigaScience, Volume 10, Issue 3, March 2021 Paper (OUP)
ITiCSE-WGR '20: Proceedings of the Working Group Reports on Innovation and Technology in Computer Science Education. June 2020 Paper (ACM)
The paper examines analytic solutions to optimization problems in tiered (hierarchical) storage under Top-K queries with HASTE, and relates them to the classic discrete optimization problem known as the ‘Secretary Hiring Problem’. DBDM/CCGrid 2019. Paper (IEEE) Pre-Print Slides
2020 IEEE International Conference on Big Data (Big Data). Paper (IEEE)
IEEE Services 2018. IEEE Xplore
This thesis presents SAESNEG, a System for the Automated Extraction of Social Network Event Groups: a pipeline for the aggregation of the personal social media footprint and its partitioning into events (the "event clustering" problem). SAESNEG facilitates a reminiscence-friendly user experience, where the user is able to navigate their social media footprint. A range of socio-technical issues are explored: the challenges to reminiscence, lifelogging, ownership, and digital death. Whilst previous systems have focused on the organisation of a single type of data, such as photos or Tweets, SAESNEG handles the variety of social network documents found in a typical footprint (e.g. photos, Tweets, check-ins), with a variety of image, text and other metadata; that is, differently heterogeneous data. It is adapted to the sparse, private events typical of the personal social media footprint.
Phase A extracts information, focusing on natural language processing. New techniques are developed, including a novel distributed approach to handling temporal expressions and a parser for social events (such as birthdays). Information is also extracted from images and metadata, and the resultant annotations feed the subsequent event clustering. Phase B performs event clustering through the application of a number of pairwise similarity strategies, a mixture of new and existing algorithms. Clustering itself is achieved by combining machine learning with correlation clustering.
The main contributions of this thesis are the identification of the technical research task (and the associated social need), the development of novel algorithms and approaches, and the integration of these with existing algorithms to form the pipeline. Results demonstrate SAESNEG's capability to perform event clustering on a differently heterogeneous dataset, enabling users to achieve lifelogging in the context of their existing social media networks.
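To give a feel for the Phase B idea of clustering from pairwise similarity, here is a minimal sketch using a simple pivot-based heuristic for correlation clustering. It is not SAESNEG's implementation: the thesis combines machine learning with correlation clustering, whereas this sketch substitutes a hand-written time threshold for the learned similarity. The names `pivot_correlation_clustering` and `same_event`, the three-hour threshold, and the sample documents are all assumptions made for illustration.

```python
import random


def pivot_correlation_clustering(items, similar, seed=0):
    """Pivot heuristic: pick a random pivot, group it with every remaining
    item judged similar to it, then repeat on whatever is left."""
    rng = random.Random(seed)
    remaining = list(items)
    rng.shuffle(remaining)
    clusters = []
    while remaining:
        pivot = remaining.pop(0)
        cluster, rest = [pivot], []
        for other in remaining:
            (cluster if similar(pivot, other) else rest).append(other)
        remaining = rest
        clusters.append(cluster)
    return clusters


def same_event(a, b, max_gap_hours=3.0):
    """Hypothetical pairwise similarity: documents close in time are assumed
    to belong to the same event. A real system would combine many signals."""
    return abs(a[1] - b[1]) <= max_gap_hours


# (document type, timestamp in hours) pairs; invented sample data.
footprint = [
    ("photo", 10.0), ("tweet", 10.5), ("check-in", 11.0),
    ("photo", 50.0), ("tweet", 51.0),
]

events = pivot_correlation_clustering(footprint, same_event)
for event in events:
    print(sorted(event, key=lambda doc: doc[1]))
```

The pivot approach is used here because it needs only a yes/no similarity judgement per pair, which is the form of input correlation clustering works from.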
SGAI 2013: Research and Development in Intelligent Systems XXX. pp 389-402.
Detecting and understanding temporal expressions are key tasks in natural language processing (NLP), and are important for event detection and information retrieval. In existing approaches, temporal semantics are typically represented as discrete ranges or specific dates, and the task is restricted to text that conforms to this representation. We propose an alternative paradigm, distributed temporal semantics, in which a probability density function models the relative probabilities of the various interpretations. We extend SUTime, a state-of-the-art NLP system, to incorporate our approach, and build definitions of new and existing temporal expressions.
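The contrast between a discrete date and a distributed interpretation can be sketched as follows. This is an illustration of the general idea only, not the paper's SUTime extension (SUTime is a Java library and none of its API is used here). The helper `around`, the choice of a Gaussian, and the three-day spread are assumptions.

```python
import math
from datetime import date, timedelta


def gaussian_pdf(x, mean, sigma):
    """Density of a normal distribution at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def around(anchor, sigma_days):
    """Return a density over dates for a vague expression such as
    'around <anchor>', instead of a single date or a hard range."""
    def density(day):
        return gaussian_pdf((day - anchor).days, 0.0, sigma_days)
    return density


# Hypothetical reading of "around Christmas 2012": most mass near 25 December,
# with nearby days remaining plausible interpretations.
christmas = around(date(2012, 12, 25), sigma_days=3.0)

for offset in (-7, -1, 0, 2):
    day = date(2012, 12, 25) + timedelta(days=offset)
    print(day.isoformat(), round(christmas(day), 4))
```

A discrete representation would assign the whole expression to one date or range; here every candidate day receives a relative weight, so downstream steps can compare interpretations rather than accept or reject a single one.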
SGAI 2012: Research and Development in Intelligent Systems XXIX. pp 207-212.
Awarded Prize for Best Poster at BCS SGAI 2012.
If you use the data, please cite the paper! ☺