This page presents the tools explored and developed during the ASIT project.
Each section below includes:

  • A short description of the tool
  • Contact information for the responsible organisation

If you're interested in learning more or collaborating, feel free to reach out directly to the listed organisations.

We hope you find these insights useful and inspiring!

Falkor logo

Falkor is a next-generation platform for data-driven investigations. We lead the way in flexible, intuitive data fusion and case management, with powerful data enrichment to help solve the challenges faced by analysts in law enforcement, cyber threat intelligence, and other agencies globally. Our system combines sophisticated AI innovation with human experience and intuition.

Integrate any data, including internal databases, files, and OSINT. Visualize entities, events, and relations in maps, link analysis, timelines, and specialized dashboards. Organize your team's work in cases, generate automatic reports, and share insights with colleagues and decision makers. Collaborate securely with iron-clad, permission-based access and crack the case together. Falkor scales to the needs of both small teams and large organizations.

          More info:

          email: hello@falkor.ai, website: Falkor.ai

Logo of HEROES

HEROES Tools: 

  • Citizen Reporting App (CR): Through this application, citizens can report potential incidents of CSA/E and/or THB that they have witnessed or become aware of. The report is then sent to the local police, who will evaluate the information provided and classify it to decide on the best course of action to take. TRL: 6 Partner: UNIKENT

         More info:

         Budi Arief - b.arief@kent.ac.uk.

  • Automatic CSAM/CSEM Identification and Classification Tool (ACPIC): Tool based on computer vision algorithms to automatically detect and classify CSA/CSE material in multimedia files (images or videos), helping LEAs save time in their investigations. The solution is a workflow composed of three separate modules: age/gender estimation, sensitive adult content detection, and context/background captioning. TRL: 6 Partner: INRIA - UCM

         More info:

         Francois Bremond - francois.bremond@inria.fr, Luis Javier García Villalba - javierv@ucm.es.

  • Open Source Intelligence Tools (OSINT): Dockerised tool capable of retrieving intelligence from diverse social networks and other sources based on an input text. The tool's objective is to streamline criminal investigations by automating data collection and analysis, facilitating the identification of correlations and additional evidence. TRL: 7 Partner: ARC

         More info:

         Fran Casino - fran.casino@gmail.com.

 Logo of ALUNA project

ALUNA Tools:

  • Generation of synthetic audio: Tool for multilingual, multi-accent synthetic audio generation from text inputs, with voice style control (such as age, gender, language, and accent). TRL: 5 Partner: UCM

         More info:

          Luis Javier García Villalba - javierv@ucm.es.

  • CSAM detection and classification: This tool applies deep learning and audio analysis techniques to detect sensitive content, specifically pornographic material, in audio and video. TRL: 5 Partner: UCM

         More info:

         Luis Javier García Villalba - javierv@ucm.es.

  • Audio age/gender estimation: This tool combines a custom pre-processing pipeline with a Whisper-based audio transformer to estimate speaker age and gender. TRL: 6 Partner: UNIKENT

         More info:

         Virginia Franqueira - v.franqueira@kent.ac.uk.

  • Automatic report generator engine: Extension of the ALUNA platform that generates downloadable reports of different kinds from the results of the analyses performed on the platform. TRL: 7 Partner: IDENER

         More info:

         Pablo Gallegos - pablo.gallegos@idener.ai.

  • Multimedia audio forgery detection: This tool detects forgeries in multimedia audio files. A customized dataset of both authentic and manipulated audio recordings is used, covering various forgery techniques such as copy-move and splicing. TRL: 5 Partner: UCM

         More info:

          Luis Javier García Villalba - javierv@ucm.es.
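
A copy-move forgery leaves the same segment at two different offsets within one recording. The toy sketch below illustrates only that idea, using exact matches over raw sample frames; it is not the project's actual method, which would compare robust acoustic features rather than raw values.

```python
def find_copy_move(samples, frame=4):
    # Toy copy-move detector: index each non-overlapping frame of raw
    # samples and report offset pairs holding identical frames.
    # Real detectors match robust features, not exact raw values.
    seen = {}
    matches = []
    for i in range(0, len(samples) - frame + 1, frame):
        key = tuple(samples[i:i + frame])
        if key in seen:
            matches.append((seen[key], i))  # (original offset, copy offset)
        else:
            seen[key] = i
    return matches

# The frame [1, 2, 3, 4] appears at offsets 0 and 8: a copy-move candidate.
audio = [1, 2, 3, 4, 9, 9, 9, 9, 1, 2, 3, 4]
```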

  • Analysis of encrypted mobile and storage devices: This tool generates tailored password guesses based on a suspect's personal information, ranking them from most to least likely. It is built on a GPT-2 architecture trained on real leaked passwords, adapted to comply with modern password requirements (e.g. at least one uppercase letter, one lowercase letter, a number, and a special character). TRL: 6 Partner: UNIKENT

         More info:

         Virginia Franqueira - v.franqueira@kent.ac.uk.
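
The policy-compliance step described above can be sketched as a simple filter over model-generated guesses. The GPT-2 model itself is not shown, the candidate list is invented for illustration, and the minimum-length value is an assumption not stated in the description.

```python
import re

def meets_policy(pw: str) -> bool:
    """Check a candidate against a typical modern policy: at least one
    uppercase letter, one lowercase letter, a digit, and a special
    character. The 8-character minimum is an illustrative assumption."""
    return (len(pw) >= 8
            and re.search(r"[A-Z]", pw) is not None
            and re.search(r"[a-z]", pw) is not None
            and re.search(r"\d", pw) is not None
            and re.search(r"[^A-Za-z0-9]", pw) is not None)

# Hypothetical model outputs, already ordered most-to-least likely;
# filtering preserves that ranking while dropping non-compliant guesses.
candidates = ["sunshine", "Sunshine1!", "P@ssw0rd", "abc", "Maria1985!"]
compliant = [pw for pw in candidates if meets_policy(pw)]
```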

  • Cloud storage forensics: Methodological approaches for dealing with distributed storage solutions, together with tools that help investigators advance their tasks. TRL: 6 Partner: ARC

         More info:

         Fran Casino - fran.casino@gmail.com.

  • Audio Signatures Analysis: The aim of this tool is to help LEAs extract unique signatures from the audio of CSAM videos. TRL: 6 Partner: ARC

         More info:

         Fran Casino - fran.casino@gmail.com.
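
As a rough illustration of what an audio signature can look like, the strongest frequency bins of a frame form a simple fingerprint that is unchanged by amplitude scaling. The sketch below uses a naive DFT over a synthetic tone; it only illustrates the idea and is not the project's actual method.

```python
import math

def dominant_bins(samples, k=3):
    """Naive DFT magnitude spectrum; return the indices of the k
    strongest frequency bins. A sequence of such peak sets over
    successive frames is one simple audio 'signature'; real
    fingerprinting (e.g. landmark hashing) is far more robust."""
    n = len(samples)
    mags = []
    for f in range(n // 2):
        re = sum(s * math.cos(2 * math.pi * f * t / n) for t, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * f * t / n) for t, s in enumerate(samples))
        mags.append(math.hypot(re, im))
    return sorted(range(len(mags)), key=lambda f: mags[f], reverse=True)[:k]

# A 64-sample frame containing a pure tone at bin 5: its signature is
# that bin, regardless of how loud the recording is.
n = 64
tone = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
quiet = [0.3 * s for s in tone]
```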

  • Perceptual Hashing and Metadata Analysis: This tool applies perceptual hashing to identify manipulated or altered images and videos by comparing them to a secure database of known content. TRL: 6 Partner: UNIKENT

         More info:

         Virginia Franqueira - v.franqueira@kent.ac.uk.
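
The compare-by-distance idea behind perceptual hashing can be sketched with a toy average hash over a tiny grayscale image: near-duplicates land at a small Hamming distance, unrelated content far away. Real systems (e.g. pHash or PhotoDNA) use far more robust transforms; the 2x2 images here are invented for illustration.

```python
def average_hash(pixels):
    # Toy perceptual hash: threshold each pixel of a small grayscale
    # image against the image mean, yielding a bit string.
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return ''.join('1' if p > mean else '0' for p in flat)

def hamming(h1, h2):
    # Number of differing bits; small distances suggest near-duplicates.
    return sum(a != b for a, b in zip(h1, h2))

original = [[10, 200], [30, 250]]
slightly_edited = [[12, 198], [33, 251]]   # minor manipulation
unrelated = [[240, 5], [220, 15]]          # different content
```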

  • Speech Recognition: This tool performs advanced processing of audiovisual content, with multilingual capabilities for recognising named entities in Spanish, English, and Portuguese. TRL: 5 Partner: UCM

         More info:

         Luis Javier García Villalba - javierv@ucm.es.

CESAGRAM logo 

CESAGRAM AI-Based Solution
The CESAGRAM AI-based solution aims to enhance the capacity of law enforcement agencies to detect, prevent, and respond to online grooming activities across the Web and social media platforms. This solution integrates three core components into a user-friendly platform: 
1.    Online Data Gathering 
2.    Linguistic Analysis 
3.    Risk Assessment 

 

1.      Online Data Gathering

The CESAGRAM solution incorporates three specialised crawlers designed to gather textual content from online sources: a Web crawler and two social media crawlers for Twitch and YouTube, respectively. The Web crawler has been designed to systematically collect data from both the Surface and Dark Web. The Twitch crawler enables both synchronous and asynchronous monitoring of chatlogs associated with live video streams on the platform. It leverages the official Twitch API to collect chat utterances in real time from multiple user accounts involved in discussions related to particular gaming video streams of interest. Finally, the YouTube crawler is designed to monitor YouTube and extract comments associated with videos uploaded to the platform using the YouTube Data API.

2.      Linguistic Analysis

The Linguistic Analysis tools (i.e., Named Entity Recognition, Sentiment Analysis, Emotion Analysis, Grooming Taxonomy Classification, and Authorship Analysis) provide advanced capabilities for analysing textual data collected by the online data gathering tools. In particular, they are designed to identify named entities (e.g., people, locations, organisations, and time-based events), assess sentiment (i.e., positive, negative, or neutral), detect emotions (i.e., happiness, anger, fear, sadness, disgust, and surprise), classify content according to grooming behaviour stages, and analyse writing patterns and stylistic factors.

3.      Risk Assessment

The Risk Assessment tool estimates the risk level related to the existence of potential grooming behaviour in online spaces. In particular, the tool utilises the outcomes of the Grooming Taxonomy Classification, applied to user messages and comments in the online conversations of interest, and provides an estimation of the risk level per user related to the existence of potential grooming incidents. Overall, four levels of risk are supported: (i) Very Low: Grooming behaviour is highly unlikely; (ii) Low: Grooming behaviour is unlikely; (iii) Moderate: Grooming behaviour is likely; (iv) High: Grooming behaviour is highly likely.
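
The mapping from a per-user score to these four levels can be sketched as simple thresholding. The score range and threshold values below are illustrative assumptions, not the parameters used by the CESAGRAM tool.

```python
def risk_level(score: float) -> str:
    """Map an assumed per-user grooming-likelihood score in [0, 1]
    to one of the four risk levels; thresholds are illustrative."""
    if score < 0.25:
        return "Very Low"   # grooming behaviour highly unlikely
    if score < 0.5:
        return "Low"        # grooming behaviour unlikely
    if score < 0.75:
        return "Moderate"   # grooming behaviour likely
    return "High"           # grooming behaviour highly likely
```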

 More info:

 e-mail: cesagram@iti.gr