Ron Marcovich, M.Sc. Thesis Seminar
Advisors: Prof. Orna Grumberg, Dr. Gabi Nakibly
Protocol inference is the process of gaining information about a protocol from the binary code that implements it. It is useful for tasks such as extracting the command-and-control (C&C) protocol of a piece of malware, uncovering security vulnerabilities in a network protocol implementation, or verifying conformance to the protocol's standard. Protocol inference usually requires time-consuming manual reverse engineering of the binary code.
We present a novel method to automatically infer the state machine of a network protocol and its message formats directly from binary code. To the best of our knowledge, this is the first method to achieve this based solely on the binary code of a single peer. We assume none of the following: access to a remote peer, access to captures of the protocol's traffic, or prior knowledge of message formats. The method leverages extensions to symbolic execution and novel modifications to automata learning. We validate the proposed method by inferring real-world protocols, including the C&C protocol of Gh0st RAT, a well-known piece of malware.