Iskra Telecom is a major Internet service provider serving thousands of residential and corporate customers.
IceRock Development, in collaboration with Iskra Telecom, developed and integrated a solution based on a Large Language Model (LLM) directly into the customer’s custom task tracker. A key feature of the project is its 100% on-premise deployment within the company's closed network.
The system analyzes long comment threads in tasks (often 50 to 100+ comments per task) and, with a single button press, generates a summary of the current status, completed work, and next steps. This has significantly reduced the time managers and operators spend getting up to speed on any particular task while completely eliminating the risk of confidential customer data leaking to external AI services.
Iskra Telecom's business processes are built around a self-developed task tracker. The company's management introduced a mandatory rule: all work, especially anything related to customer requests (e.g., “lost Internet connection,” “technical connection,” etc.), must be documented in detail in the comments on each task.
While this solved the problem of missing history, it created an equally serious new one: information overload.
When a new employee joined a task or a manager wanted to gauge its status, they had to manually sift through dozens, sometimes hundreds, of comments. The essence of the problem would get buried in a barrage of technical details, customer clarifications, internal discussions, and general noise.
We were given a clear business task: make it possible for any employee to understand the current status of a task in a matter of seconds, without reading the entire comment history.
We proposed a solution that combined the power of modern LLMs with strict security requirements — a local AI summarizer.
We integrated a “Generate Summary” button into the task tracker interface. Clicking this button performs the following actions:
1. The backend collects all comments from the task in chronological order.
2. The comments are sent to the AI server running inside the company's network.
3. The model returns a summary of the current status, completed work, and next steps.
4. The summary is displayed right in the task card.
This way, any employee who opens a task can immediately grasp the issue without needing to read through a “wall” of comments, significantly boosting operational speed.
The process was divided into several key stages:
Architecture and technology stack selection
The main challenge was the requirement to avoid cloud-based AI. The solution had to operate within a “closed loop.” We immediately abandoned the idea of external APIs and focused on delivering an on-premise solution.
This required three components: hardware powerful enough to run a large model locally, software to serve that model through an API, and the model itself.
Hardware configuration
Following our recommendation, the customer purchased a dedicated workstation: a Mac Studio with 96 GB of unified memory. It was chosen as the best price-to-performance option for local inference (running the model on-site) and was installed in the company's server room.
AI server deployment
We used LM Studio. This tool lets you download and run almost any open-weight model in just a few clicks and, most importantly, automatically starts a local API server. The server is OpenAI-compatible, so our backend developers could keep their familiar libraries and tools and simply change the URL to the internal address of the Mac Studio.
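For illustration, here is a minimal sketch of what that URL swap looks like from the backend's side, using the standard OpenAI Python client. The hostname, API key, and model identifier are placeholder assumptions, not the customer's real configuration; LM Studio's local server listens on port 1234 by default.

```python
# Illustrative sketch only: hostname, key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://mac-studio.internal:1234/v1",  # internal address of the Mac Studio
    api_key="lm-studio",  # LM Studio does not check the key, but the client requires a value
)

response = client.chat.completions.create(
    model="local-model",  # identifier of whichever model is loaded in LM Studio
    messages=[
        {"role": "system", "content": "You summarize support-task comment threads."},
        {"role": "user", "content": "Summarize the following comments: ..."},
    ],
)
print(response.choices[0].message.content)
```

Everything else in the request and response handling stays exactly as it would with a cloud provider; only the base URL points inside the network.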
Model selection and testing
We tested several models, including different versions of LLaMA and Mistral. Ultimately, the Qwen model (developed by Alibaba) showed the best results for summarizing technical texts in Russian. We experimented with different-sized versions to find the optimal balance between response speed and summary quality.
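As a rough illustration of what the model is asked to produce (the production prompt is in Russian and tuned to the tracker's data, so the wording below is an assumption rather than the real prompt), a summarization request can be assembled along these lines:

```python
def build_summary_prompt(comments: list[str]) -> str:
    """Assemble a summarization prompt from a task's comment thread.

    Illustrative sketch: the requested structure mirrors the three sections
    of the summary (current status, completed work, next steps).
    """
    thread = "\n\n".join(f"Comment {i + 1}:\n{text}" for i, text in enumerate(comments))
    return (
        "Below is the full comment thread of a support task.\n"
        "Summarize it in three short sections:\n"
        "1. Current status\n"
        "2. Work completed so far\n"
        "3. Next steps\n\n"
        + thread
    )
```

The resulting string is sent as the user message to the local server described above.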
Integration
The integration process was the fastest stage: we added the “Generate Summary” button to the task card, created a backend endpoint that collects a task's comments and sends them to the AI server, and, since LM Studio's API is OpenAI-compatible, simply pointed the existing client libraries at the Mac Studio's internal address.
The entire process, from purchasing equipment to putting it into operation, took about a week, demonstrating how efficient and quick deployment can be when you have the necessary expertise.
The hardest and trickiest part was working within the security restrictions.
The problem was that the comments from the tasks contained a vast array of confidential information, including customers' personal data (such as names and addresses), technical details of the network, and internal team discussions.
Transferring such information to any external service, be it ChatGPT, Claude, or their equivalents, was out of the question: it would pose a direct risk of violating Federal Law No. 152-FZ “On Personal Data” and cause reputational damage for the telecom operator.
Off-the-shelf, easy-to-integrate cloud AI services were strictly off-limits. We needed to figure out how to harness the full power of modern LLMs without sending a single byte of data outside the company.
We solved this problem by building a completely isolated (“air-gapped”) AI loop within the customer's infrastructure.
Thus, the entire lifecycle of a summarization request looks like this:
Employee's browser (internal network) ➔ Backend task tracker (internal network) ➔ Mac Studio with LM Studio (internal network) ➔ ... and back.
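To make the loop concrete, here is a hedged end-to-end sketch of what one “Generate Summary” click looks like from the backend's point of view. Everything is illustrative: fetch_task_comments stands in for the tracker's own data layer, and the host and model names are the same placeholder assumptions as above.

```python
from openai import OpenAI

# Placeholder for the tracker's own data-access layer (not part of this case study).
def fetch_task_comments(task_id: int) -> list[str]:
    raise NotImplementedError("query the task tracker's database here")

# The client talks only to the Mac Studio on the internal network.
client = OpenAI(base_url="http://mac-studio.internal:1234/v1", api_key="lm-studio")

def summarize_task(task_id: int) -> str:
    """Handle one 'Generate Summary' click: comments in, summary out, all in-network."""
    comments = fetch_task_comments(task_id)
    response = client.chat.completions.create(
        model="local-model",  # identifier of the model loaded in LM Studio (e.g. a Qwen build)
        messages=[
            {
                "role": "system",
                "content": "Summarize the task's comment thread: current status, "
                           "completed work, and next steps.",
            },
            {"role": "user", "content": "\n\n".join(comments)},
        ],
        temperature=0.2,  # keep summaries stable and factual
    )
    return response.choices[0].message.content
```

At no point does this code reach a host outside the internal network; the “AI provider” is simply another machine in the server room.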
No packets containing confidential data ever leave the company's environment. In doing so, we demonstrated that cutting-edge AI technologies can be implemented even under the most rigorous corporate security requirements.
The results exceeded all expectations. We did not just add “another feature”; we fundamentally changed the speed of information processing within the company.
This case clearly demonstrates that you do not need to sacrifice security to implement AI. The right architecture and use of on-premise solutions allow you to take full advantage of the technology while maintaining complete control over your data.