Rossum's Robots
US Cyber Command Challenge Problem Prize Competition

Description

Image from Karel Capek’s play Rossum’s Universal Robots
This event is open to students at all colleges and universities, professionals, and non-traditional participants who have experience applying machine learning to network packet capture (PCAP) to automatically identify anomalous behavior on a network that includes control systems.
Wikipedia and Matt Simon from Wired tell us that in 1920, Czech writer Karel Capek wrote the play Rossum’s Universal Robots or Rossumovi Univerzální Roboti. This play is believed to be the introduction of the word robot to the English language. In this play, roboti (robot) are produced in a factory as artificial beings that resemble androids from Hollywood movies instead of the traditional bots of metal, plastic and servos. These roboti start rebelling and end up killing nearly all humans on the planet. In this play, it’s easy to spot the roboti, but here at DreamPort, spotting the cyber attacker, even after the fact, let alone while attacks are in progress, can be much harder.
Overview
Machine Learning influences almost everything MISI does at DreamPort. This challenge is a:
- Robot prototype event
- Researching prototype event
We are challenging interested participants to:
- Analyze suspected ‘good’ traffic from the Hack the Port competition environment
- Receive and label (with MISI’s help) select known bad traffic from Hack the Port competition environments
- Build a classifier of anomalous network events using the algorithms of your choice
- Build a single script/executable to ingest new PCAP (on your test days) using your model to determine if and where attacks inside of Hack the Port competition networks are being launched.
Remote Participation
This event is well-suited for remote participation, although you must ensure you can download a large volume of PCAP files. During evaluation, we will issue new PCAP during a virtual teleconference and ask you to run your analytic in a screen-sharing session while we watch.
How Will It Work
Thanks to our partners, we are capturing all traffic on the Hack the Port networks. To participate in this RPE you must download the initial ‘known good’ and ‘known bad’ PCAP once released via Amazon S3. To be added as an eligible host, you must tell us the public-facing IP address you will use to download the PCAP from DreamPort S3. You should assume that this traffic comes from hosts where network time synchronization is used (for those hosts that support it).
You should begin analysis of this PCAP immediately. Can you identify all transmitting hosts? Can you determine protocols and ports? Next, what are the most effective features for your models? Which algorithms are the most effective?
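As a starting point for the questions above, here is a minimal sketch that lists transmitting hosts and the protocol/port pairs they contact, using only the Python standard library. It assumes the capture is a classic pcap file (not pcapng) with Ethernet framing and untagged IPv4 traffic; the actual format of the released files is not stated here. The function names `iter_ipv4_packets` and `summarize` are illustrative only.

```python
import struct
from collections import Counter

def iter_ipv4_packets(pcap_bytes):
    """Yield (src_ip, dst_ip, proto, src_port, dst_port) for each IPv4 packet.

    Assumes classic pcap with Ethernet link type; ports are None unless the
    packet is an unfragmented (or first-fragment) TCP or UDP packet.
    """
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")
    linktype = struct.unpack_from(endian + "I", pcap_bytes, 20)[0]
    if linktype != 1:
        raise ValueError("only Ethernet (link type 1) is handled in this sketch")
    offset = 24  # size of the pcap global header
    while offset + 16 <= len(pcap_bytes):
        _sec, _usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset)
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        # Need Ethernet (14 bytes) plus a minimal IPv4 header (20 bytes).
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ihl = (frame[14] & 0x0F) * 4      # IPv4 header length in bytes
        proto = frame[23]                 # 6 = TCP, 17 = UDP
        src = ".".join(str(b) for b in frame[26:30])
        dst = ".".join(str(b) for b in frame[30:34])
        frag_offset = struct.unpack_from("!H", frame, 20)[0] & 0x1FFF
        sport = dport = None
        l4 = 14 + ihl
        if proto in (6, 17) and frag_offset == 0 and len(frame) >= l4 + 4:
            sport, dport = struct.unpack_from("!HH", frame, l4)
        yield src, dst, proto, sport, dport

def summarize(pcap_bytes):
    """Count packets per transmitting host and per (protocol, destination port)."""
    talkers = Counter()
    services = Counter()
    for src, _dst, proto, _sport, dport in iter_ipv4_packets(pcap_bytes):
        talkers[src] += 1
        if dport is not None:
            services[(proto, dport)] += 1
    return talkers, services

# Usage (hypothetical file name):
#   with open("known_good.pcap", "rb") as fh:
#       talkers, services = summarize(fh.read())
#   print(talkers.most_common(10), services.most_common(10))
```

For large captures, a streaming reader or an established packet library is likely a better fit than loading the whole file into memory.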
At this point, we will meet virtually to discuss labeling the PCAP data. We know all our hosts and assets, and thanks to our recent events, we know who the attackers were. We can provide detail on each host’s manufacturer and purpose, and on how these hosts connect with each other. We will release a network map with details on how each attack was launched in the training data you receive.
Now, it’s on you. You need to work. Select your features, build a representative model, and score it to see if you can spot the past attacks. Wash, rinse, repeat. If you can make this work, get ready for the evaluation, which we discuss later.
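The select, build, score loop can be sketched with a deliberately simple baseline. The example below is an assumption-laden illustration, not a recommended model: it uses three hand-picked per-host features (packets sent, distinct destinations, distinct destination ports), fits a per-feature mean and standard deviation on known-good traffic, and ranks hosts in suspect traffic by their largest z-score. The record format and all function names are hypothetical, and the standard deviation is floored at 1.0 so that perfectly regular baselines do not cause division by zero.

```python
from collections import defaultdict
from statistics import mean, pstdev

def host_features(records):
    """Build {src: [packets, distinct_dsts, distinct_dst_ports]}.

    records is an iterable of (src, dst, proto, sport, dport) tuples.
    """
    pkts = defaultdict(int)
    dsts = defaultdict(set)
    ports = defaultdict(set)
    for src, dst, proto, _sport, dport in records:
        pkts[src] += 1
        dsts[src].add(dst)
        if dport is not None:
            ports[src].add((proto, dport))
    return {h: [pkts[h], len(dsts[h]), len(ports[h])] for h in pkts}

def fit_baseline(features):
    """Per-feature (mean, std) over known-good hosts; std floored at 1.0."""
    columns = list(zip(*features.values()))
    return [(mean(col), max(pstdev(col), 1.0)) for col in columns]

def score_hosts(features, baseline):
    """Largest absolute z-score per host, sorted from most to least unusual."""
    scores = {
        host: max(abs(x - mu) / sd for x, (mu, sd) in zip(vec, baseline))
        for host, vec in features.items()
    }
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

# Usage with toy records: three regular hosts form the baseline,
# then a host that touches many ports stands out in the suspect traffic.
good = [(f"10.0.0.{i}", "10.0.0.100", 6, 40000, 502)
        for i in (1, 2, 3) for _ in range(10)]
suspect = [("10.0.0.1", "10.0.0.100", 6, 40000, 502)] * 10
suspect += [("10.0.0.99", f"10.0.0.{p % 5}", 6, 40000, p) for p in range(1, 51)]
ranked = score_hosts(host_features(suspect), fit_baseline(host_features(good)))
```

Replaying the labeled 'known bad' captures through a loop like this shows quickly whether a feature set separates attackers from normal hosts before investing in a heavier algorithm.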
NOTE: You will not be eligible to participate as an attacker in our Hack the Port competitions at the same time, as this would constitute an unfair advantage.
Requirements
Here are the requirements you must meet if you wish to be considered a valid contender for the prizes:
- You must produce a single machine learning model object. If you are using separate algorithms (instead of a fused model) with separate inputs, you still must architect your code to decompress, unpack or the like from a single input file. There is no upper limit on the size of this file.
- You must produce a single script or executable capable of training on new PCAP and analyzing suspect PCAP. Everything should be done from one script. There is no strict limit on which programming language you choose.
- The ‘analyze’ output function of your script/program must supply the IP address or MAC address of the suspected attackers. You are free to supply a collection of hosts but if you do, you must use a numerical value to indicate confidence or likelihood of which host(s) are most believed to be attackers.
- You must turn over copies of the source code and binary model(s) used for analysis to be eligible for awards, but you will retain intellectual property rights.
- You must provide outputs from execution of your script during formal evaluations
- You must detect a minimum of 50% of attacks during live competition to be considered a winner.
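One way to satisfy the single-script, single-model-file, and ranked-output requirements is sketched below. It is an illustration under assumptions, not a prescribed layout: the subcommand names, the `--model` flag, the JSON report shape, and the use of `pickle` to bundle several models into one file are all choices made for this example. The `fit_models` and `score_pcap` callables are hypothetical hooks for your own training and scoring code.

```python
import argparse
import json
import pickle

def build_parser():
    """One entry point with 'train' and 'analyze' subcommands."""
    parser = argparse.ArgumentParser(description="PCAP anomaly classifier (sketch)")
    sub = parser.add_subparsers(dest="command", required=True)
    train = sub.add_parser("train", help="fit models from labeled PCAP")
    train.add_argument("--model", required=True, help="single output model file")
    train.add_argument("pcap", nargs="+", help="training PCAP files")
    analyze = sub.add_parser("analyze", help="score one suspect PCAP")
    analyze.add_argument("--model", required=True, help="single input model file")
    analyze.add_argument("pcap", help="suspect PCAP file")
    return parser

def save_bundle(models, path):
    """Store every model in one file, even if several algorithms are used."""
    with open(path, "wb") as fh:
        pickle.dump(models, fh)

def load_bundle(path):
    """Unpack all models from the single model file."""
    with open(path, "rb") as fh:
        return pickle.load(fh)

def format_report(scores):
    """Rank suspected attackers by a numeric confidence, highest first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return json.dumps(
        [{"host": host, "confidence": round(float(conf), 3)} for host, conf in ranked],
        indent=2,
    )

def main(argv, fit_models, score_pcap):
    """fit_models(pcap_paths) -> models; score_pcap(models, pcap_path) -> {host: confidence}."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        save_bundle(fit_models(args.pcap), args.model)
        return ""
    report = format_report(score_pcap(load_bundle(args.model), args.pcap))
    print(report)
    return report

# Usage (hypothetical script and file names):
#   python detector.py train --model model.bin good.pcap bad.pcap
#   python detector.py analyze --model model.bin eval_01.pcap
```

The host field can carry either an IP address or a MAC address, as the requirement allows. Note that `pickle` runs code when loading, so a model file should only be loaded alongside the source that produced it.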
Schedule
This will take place during the conference. The specific date and time will be released soon.
Evaluation
We are baselining the Hack the Port networks and will conduct a set of attacks against systems on-net. The traffic resulting from these attacks will constitute your evaluation inputs. We will not be telling you which attack is contained in which PCAP file and won’t be telling you where we launch from. We will launch a minimum of five (5) attacks. Each attack will be delivered to you as a separate PCAP file. You must run your analysis code against each PCAP file using the model you create and return the output. We will be asking you to run this analysis live in a virtual meeting screenshare so we can observe the output.
The principal evaluation criteria are now as follows:
- Did you correctly identify the attacker IPs?
- Did you correctly identify the target IPs?
- Did you correctly identify network ports?
- Did you correctly classify the attack type?
- Did you detect at least 50% of the attacks launched?
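To check yourself against the 50% criterion on the labeled training data, a small helper like the one below may be useful. How a detection is scored during the formal evaluation is not specified here, so this sketch assumes an attack counts as detected when your top-ranked host for that PCAP is the true attacker. The function name and data shapes are hypothetical.

```python
def detection_rate(predictions, truth):
    """Fraction of attacks whose top-ranked host matches the true attacker.

    predictions: {pcap_name: list of hosts, most suspicious first}
    truth:       {pcap_name: true attacker host}
    """
    hits = sum(
        1 for name, attacker in truth.items()
        if predictions.get(name, [])[:1] == [attacker]
    )
    return hits / len(truth)

# Toy example with five attacks, three of them detected.
truth = {f"attack_{i}.pcap": f"10.0.0.{i}" for i in range(1, 6)}
predictions = {
    "attack_1.pcap": ["10.0.0.1", "10.0.0.7"],
    "attack_2.pcap": ["10.0.0.2"],
    "attack_3.pcap": ["10.0.0.3"],
    "attack_4.pcap": ["10.0.0.8", "10.0.0.4"],  # attacker ranked second: a miss
    # attack_5.pcap produced no output: a miss
}
rate = detection_rate(predictions, truth)
```

With the minimum of five attacks, two detections give 2/5 = 40%, which falls short, so at least three detections are needed to reach 50%.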
We identify the following additional formal criteria for ranking performance. Note that these are not requirements, but we use them as a means to identify stellar performers. Do not work on these features instead of the actual hard requirements discussed in the Requirements section.
- Are you able to determine and report the intermediate hosts in an attack?
- Are you able to classify and report the protocols (not ports) used in the attack?