AIC stands for Akaike Information Criterion and BIC for Bayesian Information Criterion. The AIC was introduced by Akaike in 1973, while the BIC was proposed by Schwarz in 1978. Comparing the two, BIC's penalty for additional parameters is heavier than AIC's.
Note: AIC = 2k − 2 ln(L̂_M) and BIC = k ln(n) − 2 ln(L̂_M), where n is the number of data points in your training set, k is the number of parameters in the model, and L̂_M is the maximized likelihood of a model M.
And since we're trying to minimize these criteria over models with varying numbers of parameters, the only difference between them is the penalty term: k ln(n) for BIC versus 2k for AIC. Whenever n > e² ≈ 7.4, we have ln(n) > 2, so BIC penalizes extra parameters more heavily than AIC for all but the tiniest datasets.
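As a sketch of how these formulas play out in practice, here is a minimal Python example that computes both criteria from a maximized log-likelihood. The Gaussian fit and the sample data are illustrative assumptions, not from the original text; the two helper functions just transcribe the definitions above.

```python
import math

def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2*ln(L)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k*ln(n) - 2*ln(L)."""
    return k * math.log(n) - 2 * log_likelihood

# Illustrative example: maximum-likelihood Gaussian fit to a small sample
data = [2.1, 1.9, 2.4, 2.0, 1.8, 2.2, 2.3, 1.7]
n = len(data)
mu = sum(data) / n                                  # ML estimate of the mean
var = sum((x - mu) ** 2 for x in data) / n          # ML estimate of the variance
log_l = sum(-0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
            for x in data)                          # maximized log-likelihood
k = 2                                               # parameters: mean and variance

print("AIC:", aic(log_l, k))
print("BIC:", bic(log_l, k, n))
# Here n = 8 > e^2, so ln(n) > 2 and BIC comes out larger than AIC
```

With the same log-likelihood and k, the two values differ only by the penalty, k·(ln(n) − 2), which is why BIC tends to select smaller models on realistic sample sizes.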