Predicting extreme winds (i.e., wind speeds equal to or greater than 25 m/s) is essential for forecasting wind power and for the safe and efficient management of wind farms. Although feasible, predicting extreme wind with supervised classifiers and deep learning models is particularly difficult because these events are rare, which leads to highly imbalanced training datasets. To tackle this challenge, this paper uses several traditional data augmentation techniques, such as random oversampling, SMOTE, time series data warping and multidimensional data warping, to generate synthetic samples of extreme wind and its predictors, such as previous wind speed samples and meteorological variables of the surroundings. Results show that data augmentation with an appropriate oversampling ratio improves extreme wind prediction for most of the machine learning and deep learning models tested. Advanced data augmentation techniques, such as Variational Autoencoders (VAE), are also applied and evaluated for the case in which the inputs are time series.
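
As an illustration of the simplest of the techniques named above, the following minimal sketch shows random oversampling of the extreme wind class up to a chosen oversampling ratio. It is not the paper's implementation: the function name `random_oversample`, the toy data, and the definition of the ratio as minority count divided by majority count are assumptions made for this example; only the 25 m/s threshold comes from the text.

```python
import numpy as np

EXTREME_THRESHOLD = 25.0  # m/s, the extreme wind threshold stated in the text


def random_oversample(X, y, ratio, seed=0):
    """Duplicate minority rows (y == 1) until n_minority >= ratio * n_majority.

    `ratio` is assumed here to mean minority count / majority count;
    the paper may define its oversampling ratio differently.
    """
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    if len(minority) == 0:
        raise ValueError("no minority samples to oversample")
    target = int(np.ceil(ratio * len(majority)))
    n_extra = max(target - len(minority), 0)
    # Sample existing minority rows with replacement (exact duplicates).
    extra = rng.choice(minority, size=n_extra, replace=True)
    idx = np.concatenate([majority, minority, extra])
    rng.shuffle(idx)
    return X[idx], y[idx]


# Toy data (hypothetical): 95 ordinary samples and 5 extreme ones.
wind_speed = np.concatenate([np.full(95, 10.0), np.full(5, 27.0)])
y = (wind_speed >= EXTREME_THRESHOLD).astype(int)
# Three placeholder predictor columns standing in for lagged wind speed
# and meteorological variables.
X = np.random.default_rng(1).normal(size=(100, 3))

X_aug, y_aug = random_oversample(X, y, ratio=0.5)
```

In a sketch like this, oversampling would be applied only to the training split, so that duplicated extreme wind samples do not leak into validation or test data.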