Optimal structure search for Bayesian networks is known to be NP-hard, and existing optimal algorithms are applicable only to small networks of about 30 nodes. To learn larger Bayesian networks from observational data, heuristic algorithms have been used, but they find only a locally optimal structure, whose accuracy is often low. In this paper, we review optimal search algorithms in a constrained search space: the skeleton of the learned Bayesian network is restricted to be a subgraph of a given undirected graph called the super-structure. The optimal search algorithm introduced here can learn Bayesian networks with several hundred nodes when the degree of the super-structure is around four. Numerical experiments indicate that constrained optimal search outperforms state-of-the-art heuristic algorithms in terms of accuracy, even when the super-structure is itself learned from data.
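To illustrate how a super-structure narrows the search space, the following minimal Python sketch enumerates the parent sets that remain admissible for each node when every parent must be a neighbour of that node in the super-structure. The five-node graph and the helper name `candidate_parent_sets` are hypothetical and chosen only for illustration; the sketch counts candidates and does not implement the optimal search itself.

```python
from itertools import combinations


def candidate_parent_sets(node, super_structure):
    """Yield every parent set allowed for `node`.

    Under the super-structure constraint, the parents of a node must be
    chosen among its neighbours in the undirected super-structure.
    """
    neighbours = sorted(super_structure[node])
    for size in range(len(neighbours) + 1):
        for subset in combinations(neighbours, size):
            yield frozenset(subset)


# Hypothetical super-structure, stored as undirected adjacency sets.
super_structure = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

n = len(super_structure)

# Number of admissible parent sets per node with the constraint: 2**degree.
constrained = {
    v: sum(1 for _ in candidate_parent_sets(v, super_structure))
    for v in super_structure
}

# Without the constraint, any subset of the other n - 1 nodes is a candidate.
unconstrained = 2 ** (n - 1)

for v in sorted(constrained):
    print(f"{v}: {constrained[v]} constrained vs {unconstrained} unconstrained")
```

For a node with d neighbours in the super-structure, the number of candidate parent sets falls from 2^(n-1) to 2^d, so the per-node count depends on the degree rather than on the size of the network.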