TAMA, original anti-DDoS software
DDoS (Distributed Denial of Service) attacks are already an inseparable part of the Internet, whether we, as users, realize it or not. If a service or website is not working, it may have become the target of a DDoS attack. This type of attack consists of exhausting the capacity of a specific resource in the network. Dealing with this type of threat has become the bread and butter of all online service providers.
The battle for service availability is waged by every available method: from crude, manual blocking of the attacked target (which at least saves bystanders) to automated, sophisticated, multi-stage filtering processes. The more convenient and less intrusive the solution is, the higher its price, and DDoS attacks are characterized by the gap between the cost of organizing them and the resources needed to protect a system against them. Because the attack sources are distributed across the network, in principle only telecom operators (ISPs) are able to organize an effective defence. EXATEL follows this path and, moreover, stays one step ahead.
Attack techniques evolve every day, and it is imperative to keep security up to date. EXATEL decided to create its own TAMA solution so that it could directly shape the development of the DDoS protection system according to the real (and varied) requirements of customers who are subject to attacks.
Because we constantly analyse the characteristics of network traffic, when we detect an unusual event and classify it as malicious, we can react immediately and efficiently filter the malicious traffic out of the network. This approach allows the attacked service to operate normally.
What is TAMA?
TAMA is a scalable and powerful software solution that protects your network from DDoS attacks. EXATEL’s solution is delivered in a security-as-a-service model. Protection against volumetric DDoS attacks is based on a central platform.
TAMA consists of several components:
- Aperture monitors network traffic from edge routers, aggregates statistical information and forwards it to the Controller.
- Controller integrates the information from the probes into a “current status of the monitored network”, stores it in the analytical database, decides when to raise, sustain and close alarms, and starts and stops automatic mitigations.
- GlaDDoS is a filtering unit. It is scalable. The throughput per GlaDDoS depends on the settings of the mitigation policy and the parameters of the server it is running on. To achieve the best efficiency, our filter units are geographically dispersed.
- Chell is a management console that allows administrators and operators to take care of the security of our clients’ networks.
- Client portal allows our clients to view alarms and mitigations triggered in their nodes and monitor traffic in their network.
How is TAMA innovative?
- The solution runs on commonly available x86 hardware, without expensive FPGAs and ASICs.
- Achieving throughput of 100 Gb/s through the use of efficient (vertical and horizontal) scaling techniques.
- Proprietary mechanisms and techniques incorporating elements of machine learning.
- Ability to simultaneously protect multiple customers with different requirements and to protect connections independently from the provider’s actions.
- Development of a fast and flexible decision engine for threat identification and neutralization.
TAMA – Polish solution that protects you against DDoS attacks
Watch this short video and check how TAMA works
This solution allows your organisation to gain:
- TAMA is developed by a Polish telecommunications carrier, a state-owned company
- Carrier-class protection: we protect against attacks far larger than the network capacity of our Clients
- HA architecture that enables business continuity
- Protection adapted to the network capacity
- Adaptation to the needs of a Client
- Protection regardless of the network provider
- Protection independent of the Client’s other solutions
- Integration with other security systems (e.g. SIEM)
- Possibility to filter traffic even at layer 7 (the application layer)
- Automatic and manual options to react to attacks
- Possibility to create exclusions in the detection process
- Possibility to manage sessions during mitigations, with options to configure the allowed number of open sessions per source, the minimum number of packets and their size
- Possibility to define a different alarm-trigger time for each detection method
- Customer Portal – access to current statistics, possibility to generate reports
- The best value-to-price ratio on the market
- 24/7 service by EXATEL (without any resources needed on the Client’s side)
- Possibility to test the solution for free
What is ARFA?
ARFA is a continuation of the TAMA project, i.e. a set of additional modules that the TAMA anti-DDoS solution will be equipped with.
With ARFA it will be possible to prevent:
- new volumetric attacks (thanks to adding new techniques in the area of DDoS attack detection and mitigation)
- attacks on service server resources (including fragmentation attacks)
- attacks on the application layer
- BGP hijacking attacks
Work on the TAMA EXATEL project
Chief Engineer – Advanced Security Services
For me, every day at EXATEL means intensive development and stepping out of my comfort zone. Examples? Participating in R&D projects and cooperation with great scientists is an opportunity to work on the details of cyber security solutions. However, my standard workday also includes business activities. I participate in the commercialization of technologies created “from scratch” and developed in Poland. This means that I have a say in what these solutions will look like. These are pretty rewarding challenges. It is hard to find a better environment for self-development.
Senior Software Developer
There are three things that are important to me as a software developer: who I work with, what I work on, and why I do it. I appreciate cooperation with people who are willing to work, have seen and worked with many different systems, and are constantly learning and looking for new solutions. That is why I work on R&D projects at EXATEL. People, challenges, responsibility, and the possibility to influence the project design: all this gives me satisfaction in my daily work.
What is the tech stack in the TAMA project?
The project is made up of several components:
- aperture – probe in C++17
- gladdos – scrubber in C++17 and DPDK
- chell – system configuration, operator, and administrator interface; consists of a Python/Flask backend and an Angular frontend, both kept on current versions
- controller – consists of the detector and the queue workers; it also collects statistics
We use Debian Stable and the packages that come with it.
We use NoSQL databases: Redis and Elasticsearch. All components run in Docker containers and are managed by docker-compose in pre-production environments and by Docker Swarm in production.
What does test automation look like?
For test automation, we use Jenkins, which runs a series of tests for each of the system components after receiving notifications from Bitbucket.
Some of the more complicated tests are integration tests of GlaDDoS filter mechanisms. They automatically run a filter on virtual local TAP interfaces, connect to them using python, and use Scapy to generate packets and observe the GlaDDoS response to the traffic sent.
How to store IPv4 and IPv6 lists efficiently?
In both the GlaDDoS filter and the Aperture probe, we need to look up information such as “which configuration applies to a given IP address”. “Longest Prefix Match” is often needed as well: for example, a configuration for 192.0.2.0/24 has a lower priority than one for 192.0.2.0/28 when asking about 192.0.2.1, but it is the one that applies to a request for 192.0.2.100. Sometimes we want to know all applicable configurations (so for 192.0.2.1, both 192.0.2.0/28 and 192.0.2.0/24).
Hardware solutions to this problem often include quite expensive TCAM memory; we, however, are building a pure software solution. A commonly used algorithm for this problem is LCTrie. We use the simple TriTrie algorithm, which in practice works faster for small datasets than the reference implementation of LCTrie that we found, and can be easily extended so that it works for:
- IPv4 and IPv6,
- Both longest-prefix-match and “all match”,
- Both small configuration data (thousands of entries) and GeoIP (covering the entire IPv4 space).
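The two lookup modes listed above can be illustrated with a minimal sketch. It uses a plain binary trie in Python, not the TriTrie algorithm itself (whose internals are not described here), and the class name `PrefixTrie`, the policy labels and the documentation-range addresses are all assumptions made for illustration.

```python
import ipaddress


class PrefixTrie:
    """Plain binary trie over prefix bits; a stand-in, not TriTrie."""

    def __init__(self):
        # One root per IP version so 32-bit and 128-bit keys never mix.
        self._roots = {4: {}, 6: {}}

    @staticmethod
    def _bits(value, width, length):
        # Yield the `length` most significant bits of a `width`-bit integer.
        for i in range(length):
            yield (value >> (width - 1 - i)) & 1

    def insert(self, cidr, config):
        net = ipaddress.ip_network(cidr)
        node = self._roots[net.version]
        bits = self._bits(int(net.network_address), net.max_prefixlen, net.prefixlen)
        for bit in bits:
            node = node.setdefault(bit, {})
        node["entry"] = (str(net), config)

    def all_matches(self, address):
        """Every (prefix, config) covering `address`, shortest prefix first."""
        addr = ipaddress.ip_address(address)
        node = self._roots[addr.version]
        found = [node["entry"]] if "entry" in node else []
        for bit in self._bits(int(addr), addr.max_prefixlen, addr.max_prefixlen):
            node = node.get(bit)
            if node is None:
                break
            if "entry" in node:
                found.append(node["entry"])
        return found

    def longest_match(self, address):
        """The most specific (prefix, config) covering `address`, or None."""
        matches = self.all_matches(address)
        return matches[-1] if matches else None


# Hypothetical configuration entries (documentation address ranges).
trie = PrefixTrie()
trie.insert("192.0.2.0/24", "policy-wide")
trie.insert("192.0.2.0/28", "policy-narrow")
trie.insert("2001:db8::/32", "policy-v6")

print(trie.longest_match("192.0.2.1"))    # inside both prefixes
print(trie.longest_match("192.0.2.100"))  # inside the /24 only
print(trie.all_matches("192.0.2.1"))
```

Because the lookup walks the address bits from the most significant end, every prefix it passes on the way is a match, and the last one found is the longest. The same walk therefore serves both “longest-prefix-match” and “all match”, for IPv4 and IPv6 alike.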
TriTrie is generated statically and is not modified afterwards. Where we need to keep a dynamic address list, and the code must be efficient and must not use locks (i.e. you cannot allocate memory), we use Bloom filters, where adding an IP address comes down to setting some bits to 1. Bloom filters may produce false positives, so they cannot be used everywhere. Addresses cannot be removed from them once added, so every so often the filters are “rotated” and replaced with new, clean ones.
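A minimal Python sketch of the idea follows. The hash choice (BLAKE2b with a per-hash salt), the filter size, and the two-generation rotation scheme are assumptions made for illustration, not a description of the GlaDDoS code. Python also allocates memory freely, so the sketch shows only the data structure, not the lock-free property.

```python
import hashlib
import ipaddress


class BloomFilter:
    """Fixed-size bit array; adding an address only sets bits to 1."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)  # allocated once, never resized

    def _positions(self, address):
        # Works for IPv4 and IPv6: hash the packed 4- or 16-byte address.
        packed = ipaddress.ip_address(address).packed
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(packed, digest_size=8, salt=bytes([seed])).digest()
            yield int.from_bytes(digest, "big") % self.size_bits

    def add(self, address):
        for pos in self._positions(address):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, address):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(address))


class RotatingBloomFilter:
    """Two generations (an assumed scheme): entries expire after two rotations."""

    def __init__(self, **kwargs):
        self._kwargs = kwargs
        self.current = BloomFilter(**kwargs)
        self.previous = BloomFilter(**kwargs)

    def add(self, address):
        self.current.add(address)

    def __contains__(self, address):
        return address in self.current or address in self.previous

    def rotate(self):
        # Drop the oldest generation and start a new, clean one.
        self.previous = self.current
        self.current = BloomFilter(**self._kwargs)


seen = RotatingBloomFilter()
seen.add("192.0.2.7")
print("192.0.2.7" in seen)   # True: the address was added
seen.rotate()
print("192.0.2.7" in seen)   # True: still in the previous generation
seen.rotate()
print("192.0.2.7" in seen)   # False: both generations are now clean
```

Checking the previous generation as well as the current one avoids forgetting every address at the moment of rotation. The cost is that an entry lives for between one and two rotation periods.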
How do you document your code?
- Commented code (doxygen)
a. Explain the intent of a method, class, or code snippet where necessary. This makes it easier to understand what the author meant and to verify whether the code really works that way. When making changes, it helps ensure that we do not break the intended behaviour.
- System documentation in a repository (sphinx)
a. concept of operations
b. division into components
c. main interfaces
d. technical documentation of filter operation (what types of attacks we detect and how)
e. GlaDDoS network model – how it works as a network device
f. documentation for the administrator
g. documentation of roles and permissions in the system
- Changelog – description of changes since the previous version:
a. What is the business case for the change?
b. Release notes: which changes have to be made manually during deployment (e.g. related to data model migration).
c. What are the technical changes relevant to the team?
- Analytical, operational and system documents (confluence)
a. What are the functional or visual changes perceived by users?
b. User documentation
- JIRA Task Board
Why are you not using the most popular cloud solutions today?
There are several reasons. The first and most important is that as a critical infrastructure operator we are responsible for the reliability of services for the most important institutions in the country. It is a huge responsibility, but thanks to it we have the infrastructure, knowledge and experience allowing us to ensure such reliability and high availability. Therefore, the basic advantage of the cloud is not as crucial for us as it is for companies that do not have such an infrastructure.
We maintain a few hundred systems for our own and our clients’ needs, the vast majority of which are virtualized. This allows us to scale resources flexibly and reduce operating costs, which means that the second great advantage of the cloud turns out, after detailed calculation, to be available equally well in our own model.
And last but not least – control over the data. When we put our data in the cloud – so in reality, on someone else’s computer – we have to trust that the owner of the cloud infrastructure is taking proper care of our data, protecting it from loss or disclosure, and not profiting from it in any other way. When we store our data on our own resources, we take this problem upon ourselves. At the same time, this is our service that we offer to customers who want to trust us.
In practice, we are a cloud service provider ourselves and we are constantly expanding the scope of our knowledge and product offerings.
Where do the requirements and ideas for new functionalities come from?
We rely on the expertise of people working in our cyber security department and our SOC operators who directly handle attacks and our clients’ network security.
We keep abreast of the latest trends and technologies used in the cybersecurity industry.
In our research we work with consultants from technical universities and their students.
Do you use Scrum? Did it work well for you?
The use of agile methodologies allows us to continuously develop the product and implement new functionalities, as well as collect comments and suggestions from end users. Thanks to daily meetings, we can solve problems on an ongoing basis and each team member has the opportunity to learn about the progress we’ve made. After each completed sprint, the system gains new features that are used by operators or clients. According to agile methods, each completed sprint is analysed by the team for possible improvements in the product development process.
The agile method of project management worked very well while working on the TAMA system.
How do you communicate as a team? Mumble, Zoom, Rocket.Chat, Signal. Why not Slack?
We do not use Slack because you cannot install it on-premises, and it is not end-to-end encrypted. We use a variety of tools as needed.
Signal – for quick, “current” messages and notifications. It is cloud-based, but end-to-end encrypted and open source. We use it only occasionally, which keeps it uncluttered.
Mumble – virtual presence, a bit like a radio. Usually the whole team is available “online”, so you can always just “yell someone’s name” if needed, just like you would do in the office.
Rocket.Chat – all written “current” communication. There are dedicated channels and bots here that notify us about important events like broken (or fixed) builds and performance test results. This solution is quite handy and allows us to collect all important information in one place, so it facilitates our workflow.
Zoom – we use it when we want to see each other, or when we need to share a screen. We have our own on-premises server, so it is a bit safer, privacy-wise.
How do you test your app? How do you ensure code quality?
We use Git with a remote Bitbucket server for version control. We organize tasks in JIRA, and we write documentation on Confluence and directly in code (sphinx).
The application is tested on many levels; each component has its own unit and functional tests.
For CI we use Jenkins and the following additional tools, depending on the platform:
- linting (pylint, tslint, pycodestyle)
- formatting (clang-format, cmake-format, prettier, robot-tidy)
- static code analysis (SonarQube)
- unit tests (GTest, pytest, jasmine)
- integration tests (selenium, RobotFramework, scapy)
- performance tests (pktgen)
If the code does not pass the tests, it cannot be merged into master, so that version cannot be released. We use quality gates:
- Test coverage must exceed 85% of lines
- The number of linter errors cannot increase
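The two gates above can be expressed as a small check. This is a hypothetical sketch only: the names `BuildMetrics` and `gate_failures`, and the idea of comparing against a baseline build, are assumptions for illustration and do not describe the actual Jenkins or SonarQube configuration.

```python
from dataclasses import dataclass

COVERAGE_THRESHOLD = 85.0  # percent of lines, taken from the gate above


@dataclass
class BuildMetrics:
    """Hypothetical summary of one CI run."""
    line_coverage_percent: float
    linter_errors: int


def gate_failures(current, baseline):
    """Return human-readable reasons why `current` fails the gates (empty if it passes)."""
    failures = []
    # Gate 1: coverage must strictly exceed the threshold.
    if current.line_coverage_percent <= COVERAGE_THRESHOLD:
        failures.append(
            f"line coverage {current.line_coverage_percent:.1f}% "
            f"does not exceed {COVERAGE_THRESHOLD:.0f}%"
        )
    # Gate 2: the linter error count may stay equal or drop, but not grow.
    if current.linter_errors > baseline.linter_errors:
        failures.append(
            f"linter errors increased from {baseline.linter_errors} "
            f"to {current.linter_errors}"
        )
    return failures


baseline = BuildMetrics(line_coverage_percent=88.0, linter_errors=10)
candidate = BuildMetrics(line_coverage_percent=84.5, linter_errors=12)
for reason in gate_failures(candidate, baseline):
    print("gate failed:", reason)
```

Comparing the linter count against a baseline rather than against zero lets a codebase with existing warnings adopt the gate immediately while still preventing regressions.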
Correct test results allow deploying the version to the test environment. The process is triggered on-demand and it is done automatically, using a parameterized job on Jenkins.
At the end of the release process, our testers perform manual testing of the application. We organize test scenarios in JIRA using the Zephyr plugin.
The project is co-financed by the National Centre for Research and Development under the CyberSecIdent program “Cyber Security and e-Identity” (Polish: “Cyberbezpieczeństwo i e-Tożsamość”).