Generative Adversarial Networks (GANs): A Survey on Network Traffic Generation

Abstract: Generating network traffic flows remains a critical aspect of developing cyber and network security systems. In this survey, we first consider the history of network traffic generation methods and identify their weaknesses. We then introduce more recent approaches based on machine learning (ML) models. In particular, we focus on Generative Adversarial Network (GAN) models, which have developed from their initial form to encompass many variants in today's ML landscape. The uses of GANs for generating traffic flows that have appeared in the literature are then presented. For each instance, we present the architecture, training methods, generated results, identified limitations and prospects for further research. We thus demonstrate that GANs are key to future developments in network traffic generation and secure cyber and network systems.


I. INTRODUCTION
In developing, analyzing, and appraising secured networks and cyber monitoring systems, network traffic flows play a crucial role. Accessing sufficient real network traffic that is appropriate for this purpose has remained a challenge due to existing and increasing privacy and security concerns. Publicly available real traffic is largely inconsistent, insufficient, or incomplete, which limits what can be achieved by relying on it. This drives a need to generate synthetic traffic, especially for research and analytical purposes.
The process of generating synthetic traffic involves extracting key characteristics of real network traffic and using these to generate similar network traffic flows. Although a very complex process, research has shown that this is possible. Several generation techniques have been implemented over time with each successive method attaining improved generation levels over previous approaches, albeit with associated limitations. This has recently culminated in the use of Deep Learning models, particularly Generative Adversarial Networks (GANs), which are the particular focus in this survey.
In this study, we consider the range of existing network traffic generation methods, and further make the following key contributions:
• We highlight the various methods developed as traffic generation has evolved.
• We discuss the limitations encountered by the various approaches and show how each successive method overcomes these limitations.
• We show how Deep Learning techniques, particularly GANs, have surpassed previous state-of-the-art methods, delivering enhanced results culminating in packet byte level generation. This is despite their still limited implementation in the network traffic generation domain, and suggests that further research is needed on the application of GANs in this area.
To this end, the paper is divided into sections as follows. Section I.A surveys earlier traffic generation methods and their shortfalls. Section II introduces GANs and highlights seven key models that form the basic architecture of most evolving models. Six GAN models trained for network traffic generation are presented in Section III, and further discussed in detail showing their architecture, training process and results. The survey concludes in Section IV with a discussion of several observations and conclusions highlighting the prospects for GANs in this area.

A. The Evolution of Traffic Generation
Early traffic generation methods adapted the established Erlang telephony model [1] to attempt to reproduce traffic that was observed. This method used the Poisson distribution for packet arrivals [2], [3] and allowed configuration of a transfer probability matrix that could transfer protocol and port [4]. The method worked well if the network application was simple, but performance became inconsistent with complex network traffic, particularly with respect to the assumptions concerning the packet arrival process and the Poisson distribution [5]. This gave rise to Self-Similar models [6]. The ON/OFF self-similar traffic model generated traffic by aggregating multiple sub-streams, where the duration of each sub-stream cycle (either ON or OFF) was seen to follow the Pareto distribution [7]. The Multi Fractal measure, another self-similar model, applied a continuous spectrum to generate non-uniform fractal traffic [8], while the Fractional Gaussian Noise (FGN) model used the Fast Fourier Transform to generate asymptotically self-similar traffic [9]. Self-similar models showed good consistency in results but did not reflect the true traffic characteristics, particularly for packets and network flows [5].
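The two classical models above can be illustrated with a short sketch. The code below is not taken from any of the cited works: the function names, rates, shape parameter and number of sub-streams are all illustrative assumptions. It draws Poisson packet arrivals from exponential inter-arrival gaps and aggregates several ON/OFF sub-streams whose period lengths are Pareto distributed.

```python
import random

def poisson_arrival_times(rate, horizon, rng):
    """Arrival times of a Poisson process with `rate` packets/s, up to `horizon` seconds."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # exponential gaps give Poisson arrivals
        if t >= horizon:
            return times
        times.append(t)

def on_off_source(alpha, horizon, rng, min_period=1.0):
    """ON intervals (start, end) of one sub-stream; ON and OFF durations are Pareto(alpha)."""
    intervals, t, is_on = [], 0.0, True
    while t < horizon:
        duration = min_period * rng.paretovariate(alpha)  # heavy-tailed period length
        if is_on:
            intervals.append((t, min(t + duration, horizon)))
        t += duration
        is_on = not is_on
    return intervals

def aggregate_load(sources, t):
    """Number of sub-streams that are ON at time t."""
    return sum(any(start <= t < end for start, end in src) for src in sources)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
arrivals = poisson_arrival_times(rate=100.0, horizon=10.0, rng=rng)
sources = [on_off_source(alpha=1.5, horizon=10.0, rng=rng) for _ in range(20)]
print(len(arrivals), aggregate_load(sources, 5.0))
```

A shape parameter between 1 and 2 is chosen here because it gives ON and OFF periods with finite mean but infinite variance, the heavy-tailed regime usually associated with self-similar aggregate traffic.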
To capture traffic characteristics during generation, various methods were implemented for flow-level and packet-level generation. Harpoon [10], [11] was used to generate representative packet traffic based on empirical distributions (file size, inter-connection times, and number of active sessions) to match byte, packet, and flow volumes of the original data at the Internet Protocol (IP) flow-level; this did, however, exclude packet loss and flow duration. Flow-level data distributions (up to packet-level) and reproduce the same (with subtle variants) within an application domain. This provides the motivation to study and review existing methods implemented for network traffic generation while identifying areas for further research.

II. GANS
The basis of GAN operation is the use of signal backpropagation to train two models, the Generator and the Discriminator, simultaneously, pitting them against each other such that each competitively strives to outdo the other in establishing whether the generated data is real or fake [26]. GANs have continued to gain attention due to their versatility and dynamic applicability. Several improvements have been made to the initial model of Goodfellow et al. (Vanilla GAN) [26], shown generically in Fig. 1 [28]. These include adding class conditions to enhance data generation representations, incorporating convolutional layers to improve data generation and regeneration, inference network extensions, and adversarial training for enhanced robustness and faster training convergence [27]. With increasing and evolving applications, GANs have shown highly significant untapped potential in network traffic generation [28], [29], as well as a scalable hybrid architecture that is able to incorporate other model components (both supervised and unsupervised) while providing a network training platform.

A. The Vanilla GAN [26]
The Vanilla GAN incorporates two models in a minimax two-player game framework. The Generator (G) models a transform function that strives to fool the Discriminator (D) into mistaking generated data samples for real samples, while D models a discriminative function that estimates the probability that a sample comes from the true data distribution rather than from the generated data. The input to G is a low dimensional noise vector $z$ drawn from $p_z(z)$, which it transforms into a data vector ($G(z; \theta_g)$) that is presented to D as a potential data sample. The input to D comprises $G(z)$ and samples of real data $x$ drawn from $p_{data}(x)$, and it produces an output that is a single scalar ($D(x; \theta_d)$), a score that shows the likelihood of $x$ being from the original data distribution.
The minimax objective function is:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{1}$$
G targets the minimization of $\log D(x) + \log(1 - D(G(z)))$, aiming to make both $D(x)$ and $D(G(z))$ equal to 0.5 to confuse D. At the same time, D strives to correctly classify the fake versus the real data samples by maximizing $\log D(x) + \log(1 - D(G(z)))$ and forcing $D(x)$ to equal 1.
International Journal of Machine Learning and Computing, Vol. 12, No. 6, November 2022
Goodfellow et al. [26] optimized the model training using Stochastic Gradient Descent (SGD) and the minimax loss function in equation (1).
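As a small numerical illustration of the value function in (1), the sketch below estimates it from Discriminator outputs alone. The function name and the example scores are hypothetical and no network is trained; the point is only to show that a confident D yields a higher value than a fully confused D, for which the value is $-\log 4$.

```python
import math

def value_function(d_real, d_fake):
    """Sample estimate of V(D, G) from D's scores on real and generated samples."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A Discriminator that separates real from generated samples well (illustrative scores).
confident = value_function(d_real=[0.90, 0.95], d_fake=[0.10, 0.05])

# A Discriminator that outputs 0.5 everywhere, i.e. G has fully confused it.
confused = value_function(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(confident, confused, -math.log(4))
```

D tries to push the value up towards the confident case, while G tries to push it down towards the confused case.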

B. Conditional GAN (CGAN) [29]
The Vanilla GAN model was extended by Mirza and Osindero [29] by the inclusion of extra (auxiliary) information to condition the model. This extra information, $y$, which is data from class labels or other modalities, is combined as an additional input layer and fed as input to both G and D. CGAN, modified from (1), is represented in the two-player minimax objective function by:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x \mid y)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z \mid y)))] \tag{2}$$
Here, $p_z(z)$ and $y$ are combined in a joint hidden representation as inputs for G, whilst real samples from $p_{data}(x)$ and $y$ are presented as explicit inputs for D. CGAN also optimizes model training using the SGD method.
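One simple way to picture the conditioning is to append a one-hot encoding of $y$ to the inputs of both networks. The sketch below does only that; it is an assumption for illustration (Gaussian noise, plain concatenation, hypothetical function names) rather than the joint hidden representation used in [29].

```python
import random

def one_hot(label, num_classes):
    """One-hot encoding of an integer class label y."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def generator_input(noise_dim, label, num_classes, rng):
    """Noise vector z concatenated with the condition y (input to G)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return z + one_hot(label, num_classes)

def discriminator_input(sample, label, num_classes):
    """Data sample x concatenated with the same condition y (input to D)."""
    return list(sample) + one_hot(label, num_classes)

rng = random.Random(0)
g_in = generator_input(noise_dim=4, label=2, num_classes=3, rng=rng)
d_in = discriminator_input(sample=[0.2, 0.7], label=2, num_classes=3)
print(len(g_in), d_in)
```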

C. Deep Convolutional GAN (DCGAN) [30]
To stabilize GAN training, Radford et al. [30] incorporated Convolutional Neural Network (CNN) components into the GAN architecture by introducing constraints on its topology. To perform its convolutions, a CNN shifts its filter over the input matrix a number of pixels, say $s$, at a time; $s$ is known as the stride [31].
DCGAN introduced fractional-strided convolutions, where a coarser output is connected to denser pixels by interpolation (which can be described as a fractional input stride, producing the name used) [32]. These allowed G to learn its own spatial upsampling, while strided convolutions allowed D to learn downsampling. G also uses batch normalization and rectified linear unit (ReLU) activation [33] (at all layers except the output, which uses a hyperbolic tangent function), while D applies batch normalization and LeakyReLU [34] (for all layers). Fully connected hidden layers are also removed from deeper architectures, and models are trained with mini-batch SGD.
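The effect of the stride on spatial size can be checked with the standard output-size arithmetic for strided and fractional-strided (transposed) convolutions. The kernel size, stride and padding values below are illustrative assumptions, not settings reported in [30].

```python
def conv_output_size(n, kernel, stride, padding=0):
    """Spatial size after a strided convolution (downsampling, as in D)."""
    return (n + 2 * padding - kernel) // stride + 1

def transposed_conv_output_size(n, kernel, stride, padding=0):
    """Spatial size after a fractional-strided convolution (upsampling, as in G)."""
    return (n - 1) * stride - 2 * padding + kernel

# With kernel 4, stride 2 and padding 1, each layer halves or doubles the size.
down = conv_output_size(64, kernel=4, stride=2, padding=1)
up = transposed_conv_output_size(4, kernel=4, stride=2, padding=1)
print(down, up)
```

Under these assumed settings a stack of four such transposed layers would take a 4 x 4 feature map to 64 x 64, and four strided layers would reverse the process.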

D. Wasserstein GAN (WGAN) [35]
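The Wasserstein GAN, proposed by Arjovsky et al. [35], made fundamental architectural changes to the Vanilla GAN. These included replacing the Discriminator with a Critic (C) whose output is not passed through a Sigmoid function, and replacing the Minimax (BCE) loss function with the Wasserstein loss (W-Loss) [36], which approximates the distance between the real data distribution and the distribution of $G(z)$, and the amount moved. The objective function, which is modified from the standard GAN in (1), is thus represented by [37]:
$$\min_G \max_{C \in \mathcal{D}} \; \mathbb{E}_{x \sim \mathbb{P}_r}[C(x)] - \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}[C(\tilde{x})] \tag{3}$$
where $\mathbb{P}_r$ is the real distribution, $\mathbb{P}_g$ is the generated distribution, and $\mathcal{D}$ is the set of 1-Lipschitz functions. The 1-Lipschitz continuous condition is included for training to ensure that the W-Loss correctly estimates the Earth Mover's Distance (EMD) [38], which measures the distance between two probability distributions over a given region. Gulrajani et al. [39] proposed an alternative to the weight clipping used in the WGAN model to enforce the Lipschitz constraint on C, since clipping could result in convergence failure or undesired behaviour. This approach, modified from the default WGAN model (3), is represented by the Critic loss to be minimized:
$$L = \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}[C(\tilde{x})] - \mathbb{E}_{x \sim \mathbb{P}_r}[C(x)] + \lambda \, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}\left[(\lVert \nabla_{\hat{x}} C(\hat{x}) \rVert_2 - 1)^2\right] \tag{4}$$
where $\hat{x}$ are data points sampled uniformly along straight lines between pairs of points drawn from $\mathbb{P}_r$ and $\mathbb{P}_g$, and $\mathbb{P}_{\hat{x}}$ is their distribution.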

1) Wasserstein GAN with Gradient Penalty (WGAN-GP)
WGAN with Gradient Penalty (WGAN-GP) performs random interpolation between real and fake samples during training while penalizing the norm of C's gradient with respect to its input. This is represented with a penalty coefficient parameter, $\lambda$, that scales the gradient penalty.
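The interpolation and penalty can be made concrete with a deliberately simple Critic. The sketch below assumes a linear Critic $C(x) = w \cdot x$, whose gradient with respect to its input is $w$ at every point, so the penalty can be computed without automatic differentiation. The weights, samples and the value $\lambda = 10$ are illustrative assumptions.

```python
import math

def interpolate(real, fake, eps):
    """Point on the straight line between a real and a generated sample."""
    return [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]

def critic(w, x):
    """Linear Critic C(x) = w . x (an assumption made to keep the gradient analytic)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def critic_gradient(w, x_hat):
    """Gradient of the linear Critic with respect to its input: w, whatever x_hat is."""
    return list(w)

def gradient_penalty(w, x_hat, lam):
    """lam * (||grad C(x_hat)||_2 - 1)^2, the penalty term of WGAN-GP."""
    norm = math.sqrt(sum(g * g for g in critic_gradient(w, x_hat)))
    return lam * (norm - 1.0) ** 2

def critic_loss(w, real, fake, eps, lam=10.0):
    """C(fake) - C(real) plus the gradient penalty at the interpolated point."""
    x_hat = interpolate(real, fake, eps)
    return critic(w, fake) - critic(w, real) + gradient_penalty(w, x_hat, lam)

real, fake = [1.0, 2.0], [0.0, 0.0]
unit_norm = gradient_penalty([0.6, 0.8], interpolate(real, fake, 0.3), lam=10.0)
large_norm = gradient_penalty([3.0, 4.0], interpolate(real, fake, 0.3), lam=10.0)
print(unit_norm, large_norm, critic_loss([0.6, 0.8], real, fake, eps=0.3))
```

A Critic whose gradient norm is exactly 1 incurs no penalty, whereas the steeper Critic is penalized in proportion to the squared excess of its gradient norm over 1.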

2) Conditional Wasserstein GAN (CWGAN) [40]
In [40], Fabbri proposed the Conditional WGAN (CWGAN) with improvements to the WGAN and WGAN-GP models incorporating the DCGAN architecture. The model included additional data as input for both G and D (or C), while training applied the W-Loss function and the established Adam optimizer [41].

E. Bidirectional GAN (BiGAN) [42]
Donahue et al. [42] incorporated an Encoder (E) into the Vanilla GAN model that enabled it to learn the inverse of G. The proposed model, Bidirectional GAN (BiGAN), learns data mapping inversely for auxiliary supervised discrimination tasks. The objective function, based on (1), is given by:
$$\min_{G, E} \max_D V(D, E, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x, E(x))] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z), z))]$$
E is included with G for data mapping to latent representations, while D jointly discriminates in data and latent space, where the latent component is either the Encoder output ($E(x)$) or the Generator input ($z$). E is a non-linear parametric function, as are G and D, so it is trained using gradient descent; G, E and D are optimized jointly, with alternating Stochastic Gradient steps at each iteration.

III. GANS FOR NETWORK TRAFFIC GENERATION
GANs have been extensively applied for data classification and regression, image generation and synthesis, image-to-image translation, text-to-image generation, and enhanced image resolution generation [43]. Dewi et al. implemented various GAN architectures for the generation of improved and advanced traffic sign recognition [44], and synthetic prohibitory sign images [45], [46]. When evaluated with real data, results showed high resemblance and recognition accuracy.
These and several other recent works show that GANs have significant untapped potential in their ability to generate high quality network traffic flows, making them highly relevant for network traffic analysis and synthesis [47]. This section discusses models that have been trained to generate network traffic flows, and to what extent generation has been achieved.

A. Model Architectures
Although the use of GANs to generate and analyze network traffic is a relatively new application, there have been several architectures designed and applied with some success. We now summarize these, concentrating mainly on their structure.

1) Imbalanced Traffic Classification (ITCGAN) [48]
This recent development in GANs addresses the problem that typical Internet traffic has very different proportions of traffic from different applications, leading machine learning training to be dominated by the most commonly seen type.
ITCGAN, inspired by the triple-GAN [48] framework, is structured to include three modules as shown in Fig. 2. These are the Traffic Vectorization module, which sorts and isolates a vectorized representation of imbalanced traffic features (the training set); the Pre-training module, which uses Net (a superior network) to train on the vectorized set and stores the pre-trained architecture parameters; and the Formal Training module, which uses those parameters as its initial states. The last of these comprises the GAN framework that includes G, D and a Classifier (C). G is designed with Weight Generation Units, each of which corresponds to a minority class and learns a conditional mapping from the latent space to a vector of weights [48]. Unlike [26], which trains G to map a uniform random distribution to the target data, G here is trained to learn and synthesize minority samples that fit the original distribution. Its objective, (7), is expressed in terms of the real and synthetic conditional probability distributions of each minority class and the class size. ITCGAN attempts to minimize (7) to fool D, and to maximize (8) and (9) so that the synthetic samples are predicted as real labels [48]. D is designed similarly to the Vanilla GAN [26] and has objective (8), while C is obtained from the Pre-training module and has objective (9). Incorporating C in the GAN architecture facilitates correct classification of the imbalanced set while serving as a constraint to guide G during training. It also provides an indication of successful generation, thereby eliminating the need to focus on training convergence [48].

2) Packet generation of network traffic GAN (PAC-GAN) [28]
An improvement to the CGAN framework and a hybrid of CNN with the GAN architecture [28], PAC-GAN implements an inverse CNN architecture for G, while D uses the conventional CNN architecture usually employed for supervised classification. Network traffic packets are encoded by first converting individual packet byte values for representation by subranges of sequential values, and then duplicating the converted values for one-to-multi mapping (see Fig. 3 [28]). The conversion maps the tuple of packet byte value digits to a tuple containing the converted string of byte value digits. The reverse operation is performed on G's output to extract the actual packet byte values. G is further decoupled and deployed for generation of traffic to be transmitted through the Internet. Fig. 4 shows the PAC-GAN architecture [28].

3) Flow-Based network traffic generation GAN [49]
Ring et al. [49] proposed three approaches in which flow-based traffic is transformed into continuous attributes, preprocessed, and regenerated as new flow-based network data using WGAN-GP with a Two Time-Scale Update Rule (TTUR). These approaches respectively accepted network attributes as numerical values, created binary attributes from categorical attributes, and used a new similarity measure (IP2Vec) to learn vector representations of categorical attributes, as shown in Fig. 5 [49]. For IP2Vec, flow-based network traffic features comprising IP addresses, Destination Ports and Transport Protocols were extracted and served as the input vocabulary, with each value represented as a one-hot vector, i.e., a group of bits containing only one logical one with all other bits set to logic zero [50]. Each input and output layer neuron was assigned a specific value of the vocabulary, so both layers had the same number of neurons, equal to the vocabulary size. The hidden layer had fewer neurons than the input layer.
The output layer used a Softmax Classifier that normalized the outputs so that they summed to 1, thus predicting the probability of each vocabulary value appearing in the same flow as the input value.
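The layer structure described above can be sketched as a single forward pass. The vocabulary, hidden size and random untrained weights below are purely illustrative assumptions and not values from [49]; the sketch only shows how a one-hot input selects one row of the input weights (the value's embedding) and how the Softmax output forms a probability distribution over the vocabulary.

```python
import math
import random

def one_hot(value, vocabulary):
    """Vector with a single 1.0 at the position of `value` in the vocabulary."""
    return [1.0 if v == value else 0.0 for v in vocabulary]

def softmax(scores):
    """Normalize scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, w_in, w_out):
    """Input layer -> smaller hidden layer -> Softmax output over the vocabulary."""
    hidden = [sum(xi * row[h] for xi, row in zip(x, w_in)) for h in range(len(w_in[0]))]
    scores = [sum(hv * w_out[h][j] for h, hv in enumerate(hidden)) for j in range(len(w_out[0]))]
    return hidden, softmax(scores)

# Hypothetical vocabulary mixing IP addresses, destination ports and protocols.
vocabulary = ["192.168.1.10", "10.0.0.5", "80", "443", "TCP", "UDP"]
hidden_size = 3  # fewer hidden neurons than vocabulary entries
rng = random.Random(0)
w_in = [[rng.uniform(-1, 1) for _ in range(hidden_size)] for _ in vocabulary]
w_out = [[rng.uniform(-1, 1) for _ in vocabulary] for _ in range(hidden_size)]

hidden, probs = forward(one_hot("443", vocabulary), w_in, w_out)
print(hidden, sum(probs))
```

Because the input is one-hot, the hidden activation equals the corresponding row of `w_in`, which is why the hidden-layer weights can serve as vector representations once training is complete.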

4) Zipper network (ZipNet-GAN) [51]
ZipNet-GAN, proposed in [51], combined a new deep network, the Zipper Network, and GAN architectures tailored towards Mobile Traffic Super-Resolution (MTSR) to infer narrowly localized fine-grained mobile traffic patterns collected from aggregate coarse data measurements by a limited number of network probes with arbitrary granularity.
G is constructed using a deep ZipNet architecture (see Fig. 6 [51]) and comprises 3D Upscaling Blocks for extracting spatial and temporal features specific to the mobile traffic, Zipper Convolutional Blocks as the core, and Convolutional Blocks that predict the decision after summarizing distilled features received from the core. The 3D upscaling blocks receive the input and consist of a 3D deconvolutional layer, three 3D convolutional layers, a batch normalization layer and a Leaky ReLU activation layer. The core, which has 24 convolutional layers, a batch normalization layer and a Leaky ReLU activation layer, takes its input from the 3D upscaling blocks. The convolutional blocks consist of three convolutional layers, a batch normalization layer and a Leaky ReLU layer with no skip connections. D, which is based on a VGG-net neural network, consists of 6 Convolutional Blocks, with the final layer employing a Sigmoid activation function that constrains the output to a probability range. Each of these Convolutional Blocks includes a convolutional layer, a batch normalization layer and a Leaky ReLU activation layer.
Fig. 6. Architecture of G and D in ZipNet-GAN, showing the 3D upscaling blocks and Convolutional blocks of G's architecture, and D based on the VGG-net framework [51].

5) Facebook chat network traffic GAN [52]
Rigaki and Garcia [52] proposed a GAN to imitate Facebook chat network traffic and modify the network behavior of real malware so that it mimics the traffic of legitimate users while evading detection. G and D for this model were unidirectional Recurrent Neural Networks (RNNs) modelled using the Long Short-Term Memory (LSTM) architecture. The GAN used a Web Service (HTTP) to communicate with the malware by exposing two API calls: get_params, which loads the saved G model, produces new traffic parameters, and sends them to the malware as a JavaScript Object Notation object; and feedback, which loads the saved G and D models, adds the parameters of the previous time window to the current dataset based on the feedback received, and proceeds to another training round. The C2 channel is kept active and operational while HTTP facilitates communication over the channel to the C2 Server, and the Intrusion Prevention System (IPS) serves to secure the channel from non-Facebook chat traffic. The model framework is illustrated in Fig. 7 [52].

6) Packet capture file generator style-based GAN (PcapGAN) [53]
Proposed to generate and augment Pcap data (Packet Capture data for analysis), PcapGAN comprises an Encoder (E) with four network data parts, a G that generates new data for each part of E, and a Decoder that replaces D. Information from Pcap data is extracted by E and converted into features such as a graph (IP source → IP destination), an image (time interval), and a layer sequence structured from network data. Style (a vector value) is used to represent relationships between hosts (Server-Client and command and control Server-Botnet). Each data sample generated by E is labelled by the edge style (that is, the style value of the relationship between hosts) and used in designing G, which operates in a hybrid structured manner to generate new data that the Decoder combines and reconstructs into a valid Pcap file. Fig. 8 shows the PcapGAN architecture [53].
PcapGAN's objective function is a version of (1) modified by the addition of parameters.

B. Traffic Generation Results to Date
We now summarize the traffic generation results that have been obtained using various GAN implementations in the literature. In each case, we also summarize the structures and parameters that have been employed in the instances cited.

1) ITCGAN
Unlike previous GAN models, ITCGAN focused on solving the network traffic data imbalance problem. The Pre-training module trained for 300 epochs and used the idea of focal loss, a method that places increased weight on rare samples. The Formal Training module set the batch size for all models (G, D and C) to 512 and used 40000 training steps, where the ITCGAN parameters were updated twice within a batch for every training step. G and D had fully connected layers and a learning rate of $10^{-3}$ with a decay of $10^{-4}$, while C employed a learning rate of $3 \times 10^{-4}$ and a decay of $10^{-6}$. ITCGAN used a ReLU activation function for hidden layers in both the Pre-training and Formal Training modules, with optimization using the Adam optimizer [48].
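To illustrate how focal loss shifts weight towards poorly classified samples, the sketch below uses the commonly cited form $FL(p_t) = -(1 - p_t)^{\gamma} \log(p_t)$, where $p_t$ is the probability the model assigns to the true class. Whether [48] uses exactly this form, and the value of $\gamma$ it uses, are assumptions here; the probabilities are illustrative.

```python
import math

def cross_entropy(p_true):
    """Standard cross-entropy for the probability assigned to the true class."""
    return -math.log(p_true)

def focal_loss(p_true, gamma=2.0):
    """Cross-entropy scaled by (1 - p_true)^gamma, which shrinks easy samples' loss."""
    return ((1.0 - p_true) ** gamma) * cross_entropy(p_true)

easy = 0.9  # well-classified sample, typical of a majority class
hard = 0.3  # poorly classified sample, typical of a rare class

# Share of the total loss contributed by the hard sample under each loss.
ce_share = cross_entropy(hard) / (cross_entropy(hard) + cross_entropy(easy))
fl_share = focal_loss(hard) / (focal_loss(hard) + focal_loss(easy))
print(ce_share, fl_share)
```

With $\gamma = 0$ the expression reduces to ordinary cross-entropy; larger values of $\gamma$ concentrate the training signal on the samples the model currently gets wrong.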
To evaluate the results, baseline performance was established by training a classifier without addressing imbalance. Then, a range of metrics was considered to compare ITCGAN with established techniques, namely Random Over Sampling (ROS), Adaptive Synthetic Algorithm (ADASYN), Synthetic Minority Oversampling Technique (SMOTE), SMOTE + Support Vector Machine (SMOTE-SVM), SMOTE + Tomek Links (SMOTE-TL) and a CGAN; the reader is referred to [48] and the references therein for full details of these methods. Here we summarize the global metric results for G-mean (GM) and Mean Area Under Precision-Recall Curve (MAUC-PR) that show ITCGAN's performance.
ITCGAN outperformed the other methods on GM and MAUC-PR (Table I [48]). The authors also explored the effects of the Pre-training module, the constraint provided by C to G, and changing the fully connected G and D layers to convolutional layers. They found that the Pre-training module enabled faster convergence, that the constraint was essential, and that convolutional layers increased training duration and difficulty.
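The reason a metric such as G-mean is preferred over plain accuracy on imbalanced traffic can be seen in a small example. The sketch below assumes the usual multi-class definition, the geometric mean of the per-class recalls; the exact definition used in [48] may differ, and the labels are made up for illustration.

```python
import math

def per_class_recall(y_true, y_pred):
    """Recall for each class present in the true labels."""
    recalls = {}
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls[c] = hits / total
    return recalls

def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls."""
    values = list(per_class_recall(y_true, y_pred).values())
    return math.prod(values) ** (1.0 / len(values))

def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

y_true = [0] * 8 + [1] * 2          # class 1 is the minority class
majority_only = [0] * 10            # classifier that ignores the minority class
balanced = [0] * 6 + [1] * 2 + [1, 0]

print(accuracy(y_true, majority_only), g_mean(y_true, majority_only))
print(accuracy(y_true, balanced), g_mean(y_true, balanced))
```

The classifier that never predicts the minority class still reaches high accuracy, but its G-mean collapses to zero because one of the per-class recalls is zero.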

2) PAC-GAN
PAC-GAN was the first model to successfully generate and manipulate network traffic data (ICMP Pings, DNS queries and HTTP Get Requests) at the level of individual IP packet bytes; the generated traffic was also deployed to the Internet, where it elicited responses. Previous GAN traffic generating models only produced traffic at metadata/flow-level. In the network, G consisted of six layers: two fully connected layers, a reshape layer, two deconvolution layers and an output convolutional layer. D had two 2D convolutional layers, a fully connected layer, and an output linear layer for classification. Both G and D used L2 regularization (with a weight decay value of $2.5 \times 10^{-5}$), a ReLU activation function, Adam Optimization (with a learning rate of $10^{-4}$ and a beta 1 exponential decay of 0.5), and the W-Loss function (with a gradient penalty of 1.0).
The success rate in generating individual traffic types is shown in Table II [28]. Although this was as high as 99% for some traffic types and 87.7% averaged over all tasks, the model could not achieve the same success rate when generating multi-serial network packets from a greater variety of network traffic types.

3) Flow-Based network traffic generation GAN
Five training samples were generated by IP2Vec (an input and an expected output value for each sample) from each of Source IP Address, Destination IP Address, Destination Port and Transport Protocol flows. The neural network was trained with captured flow-based network traffic, taking the value generated by IP2Vec as its input and producing the probability for each input vocabulary value, using backpropagation for learning. To reduce the backpropagation training time, IP2Vec used Negative Sampling to modify a small percentage of the weights. After training, IP2Vec ceased using the neural network and switched to employing the weights of the hidden layers as m-dimensional vector representations of the IP Addresses. The network attributes were dealt with in three ways to investigate which method produced the most realistic values.
• First, network attributes were interpreted as numbers (even though they were in fact categorical). Each octet of an IP address was transformed to a continuous attribute within the interval [0, 1]. Ports were divided by the highest port number and transformed to continuous attributes, while other attributes (duration, bytes, and packets) were normalized to the interval [0, 1]. This approach was termed the Numeric-based Improved WGAN (N-WGAN-GP).
• Second, each octet of an IP address was mapped to an 8-bit binary representation, producing a 32-bit binary representation of the address. Similarly, ports were transformed to 16-bit binary representations, while bytes and packets were transformed to binary representations limited to a length of 32 bits. The duration attribute remained normalized in [0, 1]. The technique was named the Binary-based Improved WGAN (B-WGAN-GP).
• Third, the Embedding-based Improved WGAN (E-WGAN-GP) embedded IP addresses, ports, duration, bytes, and packets into an m-dimensional continuous feature space $\mathbb{R}^m$. Here, each flow generated 13 training samples, each consisting of an input and an output value. Flows were then mapped to embeddings, which were re-transformed to the original space after generation. IP2Vec was used to replace values by their closest generated embeddings.
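The first two encodings can be sketched in a few lines. The helper names are hypothetical, and dividing each octet by 255 is an assumption about how the octets were scaled to [0, 1]; dividing ports by 65535 follows the description of using the highest port number.

```python
def numeric_ip(ip):
    """N-WGAN-GP style: each octet scaled to [0, 1] (division by 255 is an assumption)."""
    return [int(octet) / 255 for octet in ip.split(".")]

def numeric_port(port):
    """N-WGAN-GP style: port divided by the highest port number."""
    return port / 65535

def binary_ip(ip):
    """B-WGAN-GP style: four 8-bit octets giving a 32-bit representation."""
    return [int(bit) for octet in ip.split(".") for bit in format(int(octet), "08b")]

def binary_port(port):
    """B-WGAN-GP style: 16-bit representation of the port."""
    return [int(bit) for bit in format(port, "016b")]

def decode_binary_ip(bits):
    """Inverse of binary_ip, used to read a generated bit vector back as an address."""
    octets = [int("".join(str(b) for b in bits[i:i + 8]), 2) for i in range(0, 32, 8)]
    return ".".join(str(o) for o in octets)

print(numeric_ip("192.168.0.255"), numeric_port(443))
print(binary_port(80), decode_binary_ip(binary_ip("192.168.0.255")))
```

The numeric form places addresses such as 192.168.0.1 and 192.168.0.2 very close together even when they are unrelated hosts, which is consistent with the unwanted similarities reported for N-WGAN-GP, while any 32-bit pattern decodes to a syntactically valid address, consistent with B-WGAN-GP being able to produce previously unseen values.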
For training, Ring et al. [49] used the open-source unidirectional flow-based network traffic dataset CIDDS-001 [54]. G and D for all three methods (N-WGAN-GP, B-WGAN-GP and E-WGAN-GP) were configured as feed-forward neural networks and trained for five epochs. Euclidean distance was used to avoid calculation errors, especially where the probability of generated data is zero.
Results using N-WGAN-GP showed unwanted similarities between categorical values with significant errors (such as similarities between IP addresses that should be ranked as dissimilar), making it unsuitable for generating realistic flow-based network traffic. However, as shown in Table III [49], both B-WGAN-GP and E-WGAN-GP successfully generated high-quality flow-based network traffic. E-WGAN-GP achieved the better evaluation results (an average of 99.83% over seven heuristic domain knowledge sanity checks), while B-WGAN-GP was able to generate previously unseen values (such as IP addresses or ports), which was not possible with E-WGAN-GP.

4) ZipNet-GAN
Here, the model was trained on Telecom Italia's publicly available Big Data Challenge real-world mobile traffic dataset using the SGD approach, optimized with the Adam Optimizer for faster convergence, while the loss was calculated based on Euclidean distance. D and G progressed in training synchronously and the learning rate was $10^{-4}$. ZipNet-GAN outperformed existing Super Resolution (SR) methods for all MTSR instances, as shown in Fig. 9 [51]. It was evaluated for Peak Signal-to-Noise Ratio (PSNR), Normalised Root Mean Squared Error (NRMSE) and Structural Similarity Index (SSIM), and achieved 40% higher PSNR, up to 78% smaller NRMSE and 36.4 times higher SSIM when compared with existing SR techniques.
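Two of these metrics are simple enough to sketch directly. Normalisation conventions vary between papers, so the choices below (NRMSE normalised by the mean of the ground truth, PSNR using the ground-truth maximum as the peak) are assumptions rather than the exact definitions of [51], and the traffic values are invented for illustration. SSIM is omitted because it requires windowed statistics.

```python
import math

def mean_squared_error(truth, pred):
    return sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth)

def nrmse(truth, pred):
    """Root mean squared error divided by the mean of the ground truth (assumed convention)."""
    mean_truth = sum(truth) / len(truth)
    return math.sqrt(mean_squared_error(truth, pred)) / mean_truth

def psnr(truth, pred):
    """Peak signal-to-noise ratio in dB, with the ground-truth maximum as the peak."""
    peak = max(truth)
    return 10.0 * math.log10(peak ** 2 / mean_squared_error(truth, pred))

fine_grained_truth = [10.0, 20.0, 30.0, 40.0]  # hypothetical per-cell traffic volumes
close_estimate = [12.0, 18.0, 33.0, 39.0]      # a good super-resolution output
flat_estimate = [25.0, 25.0, 25.0, 25.0]       # spreading the coarse total uniformly

print(nrmse(fine_grained_truth, close_estimate), psnr(fine_grained_truth, close_estimate))
print(nrmse(fine_grained_truth, flat_estimate), psnr(fine_grained_truth, flat_estimate))
```

A better reconstruction gives a lower NRMSE and a higher PSNR, which is the direction of the improvements reported above.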

5) Facebook chat network traffic GAN
This GAN was tested by taking in Facebook chat flow parameters, which the GAN used to train for a predefined number of epochs before sending its output to the malware via Web Services. Malware traffic remained continuously active in the network and adapted its nature based on detection status and data from additional GAN training. Both G and D had a depth of one, 128 hidden units and a sequence length of 6. Model training was via Batch Gradient Descent and the Adam optimizer with a learning rate of $10^{-3}$, and one of the two networks was trained for three epochs for every one epoch of the other. The dataset used for training consisted of network captures (text, images, links, and documents) of Facebook chat between two users over 24 hours, converted to time series (features included network flow duration, total number of bytes in flow, and inter-flow time calculated from the timestamp of each flow) and used as the input variable. The first objective of the model was to determine if a GAN could mimic the traffic profile of Facebook chat. The Detector was used to determine at the end of each time window if the traffic flow should be logged (fewer than three flows in the threshold), unblocked (due to no decision) or blocked (more than three flows in the threshold). As shown in Fig. 10 [52], increasing the number of epochs eventually led to no blocked flows.

6) PcapGAN
Here, the style-based G took the IP graph (a sparse three-dimensional matrix whose form includes the style vector's batch size) as its input and generated a synthetic version of this as network flow data. To generate the time image, G performed a mapping of a concatenation of the style vector (instead of latent space) and the intermediate vector. The layer sequence was encoded as sequential data using the SeqGAN model [55]; the model was also customized to create sequential data labelled with the style vector (for example, the input style vector). Option data (a sequence of identical numbers) was augmented to both sequential data (using SeqGAN) and labelled sequential data (using any simple model). The Decoder received the generated IP graph and time image, the layer sequence, and the option data, and used them to create a Pcap file in three steps. First, layer sequences were converted into combinations of protocols. Second, packet data was created for each protocol using the option data. The final step was randomly setting the start time for the first packet of each edge, using the time interval information of the time image to set the reception time of the other packets, and then chronologically sorting the generated packets at each edge of the IP graph before transforming the result into a Pcap file.
PcapGAN augmented a cyber-attack dataset (GTISC) [56] with a model pre-trained on a normal dataset (MACCDC 2012) [57], then converted the initial datasets (original GTISC and MACCDC 2012) and the generated (augmented) data into KDD format via the KDD 99 extractor [58] for use with an Intrusion Detection Algorithm (IDA). The converted MACCDC data, GTISC data and generated data were labelled data A, data B and data C, respectively. In the experiments, string data were transformed into integers and normalized, and sklearn algorithms [59] were then used to calculate accuracy, precision, recall and F1 score values (the F1 score being a weighted average of the precision and recall). The results showed consistent accuracy for similarity at 0.5 (showing that the IDA was not able to distinguish between original data and generated data). A further test using a classification model was conducted to distinguish between the original data and the generated data and showed that the performance of each IDA improved by 2% to 4%, as shown in Fig. 11 [53].
Fig. 11. Results of the IDA on classification models: RES1 (distinguishing data A and data B) and RES2 (distinguishing data A and data C). RES2 performance shows that the generated GTISC dataset is valid [53].

IV. DISCUSSION
Despite the progress recorded in other fields, GANs are only just entering the realm of traffic generation. As discussed in the previous sections and shown in Table IV, their application to traffic generation has succeeded in some instances.
ZipNet-GAN was only tailored to mobile traffic inference and pattern analysis, and not to generating traffic flows. Although PcapGAN successfully generated high quality cyber data (particularly pcap files), this was only for analysis of network flow graph and timestamps. A rate of unblocking actions greater than 63% using the Facebook Chat Network Traffic GAN method showed that GANs could be successfully deployed to mimic Facebook traffic flows.
Unlike the other GAN models reviewed, the Facebook Chat GAN required only limited data for training, and it was successfully implemented using the Stratosphere behavioural IPS in a router to block traffic that was not similar to Facebook chat traffic. However, the framework involved separate deployment of web services to facilitate communication, and other types of network traffic were not tested.
The Flow-Based Network Traffic Generation GAN was trained only on single flow-based network traffic. However, the model showed sufficient potential to indicate that further studies could achieve training to generate sequences of traffic flows. The PAC-GAN model revealed the potential of GANs for network traffic flow generation and the possibility of extending research to multi-serial network packets for generating multi-variant traffic flow types, especially for large-scale traffic and when RNNs are incorporated as a hybrid with GANs. ITCGAN successfully addressed imbalanced traffic, underlining the potential of GANs for realistic network traffic generation.
We would thus contend that, even though network traffic generation using GANs has achieved mixed success as shown in Table IV, further research and improvements to model architectures and training can produce results exceeding the successes recorded to date.

V. CONCLUSIONS
Network traffic generation methods such as Poisson models worked well only for simple network applications and were inconsistent with complex network traffic flows. Generation models utilizing self-similar traffic solved the consistency issues associated with Poisson models but were not able to reflect the true characteristics of network flows. Methods that generate traffic based on characteristic analysis, such as Harpoon, flow-level matrix, multi-thread simulation and interdomain traffic simulation, were only able to generate traffic at the flow level. This gave rise to Plab and Swing, which achieved packet-level generation but could define neither the distributions that traffic characteristics should follow nor the number of characteristics to be considered. Application protocol-based traffic generation models were successfully implemented to generate and simulate network traffic that resembled the original network traffic. This was a significant achievement compared to previous generation levels, even though these models could only produce traffic for particular application protocols. Efforts to produce more realistic synthetic traffic flows have led to the employment of GANs.
ITCGAN, PAC-GAN, Flow-based traffic generation GAN, Facebook Chat GAN, ZipNet GAN and PcapGAN are among the GAN models that have been used to generate traffic flows. ZipNet GAN, PcapGAN and Facebook Chat GAN have been implemented for different purposes: respectively, inferring and analysing traffic patterns; generating Pcap files together with network flow graph and timestamp analysis; and mimicking traffic flow capture. The flow-based traffic generation GAN achieved metadata-level traffic generation for single flows only. Nevertheless, PAC-GAN successfully generated network traffic flows at the packet byte level, thereby showing that GANs can generate traffic flows beyond the flow-based level. Further research is recommended into the generation of a variety of traffic flows at the packet byte level, as well as sequences of traffic flows. ITCGAN further introduced a new direction by showing the ability of GANs to address the common data imbalance problem in network traffic flows while generating high-quality network traffic data. Thus, when compared with previous methods, it is evident that GANs have exceeded the existing state of the art in network traffic flow generation, which should inspire further research in this area.
paper; M. S. L. added material and edited the work to produce the final version; both authors approved the final version.