Insight Centre for Data Analytics - Conference Items

Recent Submissions

  • Item
    Approximating a global objective by solving repeated sub-problems for an oven scheduling problem
    (ModRef 2024, 2024-09-02) Simonis, Helmut; Science Foundation Ireland
    In this paper we describe results for an oven scheduling problem studied during the European ASSISTANT project. This is a multi-stage scheduling problem arising in the production of rotor assemblies for compressors, provided by one of the industrial partners in the consortium. The main resource type is a set of identical ovens, which are used to heat-treat components in different ways. The process for one product may require multiple consecutive steps using these ovens, with specific temperature and process requirements at each step. Multiple tasks of different orders can be processed together in the same oven if the temperature and process parameters for the tasks are identical. Processing multiple tasks together is more energy efficient, but typically forces some tasks to wait until all scheduled items are available, possibly impacting product quality and creating delays for the orders. The main difference from the oven scheduling problem studied in the literature is that we are not just trying to find an optimal solution to the short-term, detailed scheduling problem. Rather, we are interested in how selecting different parameters and constraints for the short-term scheduling problem affects the overall long-term, global objective of minimizing energy use while maintaining the quality of products. Turning ovens off and then on again is considered bad for energy and maintenance reasons, so we try to minimize the number of shutdown events over the full planning horizon while dealing with demand fluctuations over time. Information about jobs to be scheduled is only available within a limited time horizon, so we cannot solve the overall problem as one global optimization problem. Results indicate that we obtain a good overall schedule with a simple detailed scheduling model.
  • Item
    Irish-based Large Language Model with extreme low-resource settings in machine translation
    (ACL, 2024-08-11) Tran, Khanh-Tung; O'Sullivan, Barry; Nguyen, Hoang D.; Science Foundation Ireland
    Large Language Models (LLMs) have demonstrated exceptional performance in a wide range of natural language processing tasks. However, their success does not always extend to machine translation, particularly in challenging scenarios such as translating low-resource languages. This study investigates the multilingual capability of LLMs, with a case study on Irish, an extremely low-resource language, focusing on translation tasks between English and Irish. We propose a dynamic, efficient language adaptation framework for English-centric LLMs, which involves layer-specific adjustments and subsequent fine-tuning for machine translation. Our findings highlight several key insights: (1) different layers in the LLM serve distinct functions such as language understanding and task reasoning, (2) effective translation requires extensive pre-training on both source and target languages, and (3) targeted fine-tuning for machine translation leads to significant improvements of 36.7% for English to Irish and 133.4% for Irish to English compared to the previous state-of-the-art.
  • Item
    Carbon stock estimation at scale from aerial and satellite imagery
    (Institute of Electrical and Electronics Engineers (IEEE), 2024-07-30) To, Alex; Pham, Hoang Quoc Viet; Nguyen, Quang H.; Davis, Joseph G.; O’Sullivan, Barry; Pan, Shan L.; Nguyen, Hoang D.; Science Foundation Ireland
    In the ongoing efforts to mitigate the effects of climate change, the capability to reliably estimate forest carbon stock on a global scale is vital to support sustainable development. This entails the investigation of tree coverage from diverse forest ecosystems worldwide, necessitating a substantial volume of high-resolution images. This paper integrates a variety of remote sensing data sources, from aerial to satellite imagery, for the training and development of our AI system. Given the heterogeneous nature of these data sources, we develop a standardization method to ensure consistent image size and resolution between source platforms. Our harmonized dataset includes 86,088 training images and 21,768 validation images, each with a high resolution of 1.194 m² per pixel. We introduce a novel technique for tree semantic segmentation which offers a more effective alternative to traditional individual tree crown delineation for large-scale tree coverage estimation. To assess the adaptability of our AI models, we conducted experiments on a hand-annotated satellite image test set and achieved a High Vegetation IoU score of 45.73%. Building on these findings, we present an interactive web-based Geographic Information System for navigating high vegetation segmented satellite images and estimating carbon stock on a global scale.
  • Item
    A constraint-based local search for designing tree networks with distance and disjoint constraints
    (Institute of Electrical and Electronics Engineers (IEEE), 2015-11-12) Arbelaez, Alejandro; Mehta, Deepak; O'Sullivan, Barry; Quesada, Luis; Seventh Framework Programme; Science Foundation Ireland
    In many network design problems, clients are required to be connected to a facility under path-length constraints and budget limits. Each facility is associated with a tree network where the root is the facility itself and the remaining nodes of the tree are its clients. An inherent feature of these networks is that they are vulnerable to failure. Therefore, it is often important to provide some resiliency in the network. We focus on a problem where we want to ensure that all clients are connected to two facilities so that if one facility fails then all clients can still be served by another facility. Optionally, one might require that each client is resilient to a single link or node failure by enforcing that the paths used to connect a client to its two facilities are either edge-disjoint or node-disjoint respectively. In this paper we use local search to evaluate the trade-offs of cost versus resiliency and coverage versus resiliency for a real-world problem in the field of optical networks.
  • Item
    Parallelising the k-medoids clustering problem using space-partitioning
    (Association for the Advancement of Artificial Intelligence, 2013) Arbelaez, Alejandro; Quesada, Luis; Japan Society for the Promotion of Science; Science Foundation Ireland
    The k-medoids problem is a combinatorial optimisation problem with multiple applications in Resource Allocation, Mobile Computing, Sensor Networks and Telecommunications. Real instances of this problem involve hundreds of thousands of points and thousands of medoids. Despite the proliferation of parallel architectures, this problem has been mostly tackled using sequential approaches. In this paper, we study the impact of space-partitioning techniques on the performance of parallel local search algorithms to tackle the k-medoids clustering problem, and compare these results with the ones obtained using sampling. Our experiments suggest that approaches relying on partitioning scale better while preserving the quality of the solution.
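
The space-partitioning idea in the k-medoids abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: the slab partition along the first coordinate, the alternating assign-and-update local search, the equal split of medoids across partitions, and every function name here are assumptions chosen for illustration. Each partition is solved independently, which is what makes the work easy to hand to parallel workers.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)


def k_medoids(points, k, seed=0, max_iters=50):
    """Simple alternating local search (an assumed heuristic, not the paper's)."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)  # indices into points
    for _ in range(max_iters):
        # Assignment step: attach every point to its nearest medoid.
        clusters = [[] for _ in range(k)]
        for i, p in enumerate(points):
            nearest = min(range(k), key=lambda c: math.dist(p, points[medoids[c]]))
            clusters[nearest].append(i)
        # Update step: pick the member minimising intra-cluster distance.
        new_medoids = []
        for c, members in enumerate(clusters):
            if not members:  # possible with duplicate points; keep old medoid
                new_medoids.append(medoids[c])
                continue
            best = min(
                members,
                key=lambda i: sum(math.dist(points[i], points[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return [points[m] for m in medoids]


def partition_slabs(points, parts):
    """Hypothetical partitioning scheme: equal-sized slabs along the first axis."""
    ordered = sorted(points)
    n = len(ordered)
    return [ordered[i * n // parts:(i + 1) * n // parts] for i in range(parts)]


def partitioned_k_medoids(points, k, parts):
    """Solve each slab independently and concatenate the medoids.

    Assumes k is divisible by parts and each slab holds at least k // parts points.
    Threads are used only to show the independent-task structure; CPU-bound
    Python code would need processes for a real speed-up.
    """
    slabs = partition_slabs(points, parts)
    k_per_slab = k // parts
    with ThreadPoolExecutor(max_workers=parts) as pool:
        results = pool.map(
            lambda args: k_medoids(args[1], k_per_slab, seed=args[0]),
            enumerate(slabs),
        )
        return [m for slab_medoids in results for m in slab_medoids]


rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(200)]
medoids = partitioned_k_medoids(points, k=4, parts=2)
print(len(medoids), round(total_cost(points, medoids), 3))
```

Because no slab ever looks at points outside its own region, medoids near slab boundaries may be placed worse than in a global search; how much quality is lost in practice is the kind of question the paper's experiments address.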