The Low-Level Virtual Machine (LLVM) compiler infrastructure is a useful tool for building just-in-time (JIT) compilers, in addition to providing a reliable front end (the Clang compiler) and an elaborate middle end whose optimizations improve runtime performance. This paper specifically addresses building a JIT compiler with LLVM, with the aim of obtaining hardware architecture details of the underlying machine, such as the number of cores and the number of logical cores per processing unit, and providing them to the NUMA-BTLP static thread classification algorithm and to the NUMA-BTDM static thread mapping algorithm. The JIT compiler then runs these hardware-aware algorithms within an optimization pass. The JIT compiler in this paper is designed to run on a parallel C/C++ application (one that creates threads using Pthreads) before the application is executed on a machine for the first time. To do so, the JIT compiler takes the native code of the application, obtains the corresponding LLVM Intermediate Representation (IR) for that native code, and executes the hardware-aware thread classification and thread mapping algorithms on the IR. The NUMA-Balanced Task and Loop Parallelism (NUMA-BTLP) and NUMA-Balanced Thread and Data Mapping (NUMA-BTDM) algorithms are expected to reduce energy consumption by up to 15% on NUMA systems.
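As an illustration of the kind of hardware details mentioned above, the following is a minimal C++ sketch of how the number of logical cores and the number of logical cores per physical core could be queried on Linux. It is not the implementation used by the JIT compiler described in this paper. The names `HardwareInfo`, `countCpusInList` and `queryHardware` are hypothetical, and the sketch assumes a Linux sysfs layout and that all physical cores expose the same number of hardware threads.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical container for the hardware details named in the text.
struct HardwareInfo {
    unsigned logicalCores = 0;        // logical CPUs reported by the runtime (0 if unknown)
    unsigned logicalPerPhysical = 1;  // hardware threads per physical core
    unsigned physicalCores = 0;       // derived: logicalCores / logicalPerPhysical
};

// Counts the CPUs in a Linux sysfs CPU list such as "0,4", "0-1" or "0-3,8-11".
unsigned countCpusInList(const std::string &list) {
    unsigned count = 0;
    std::stringstream ss(list);
    std::string token;
    while (std::getline(ss, token, ',')) {
        // Skip empty or whitespace-only tokens.
        if (token.find_first_of("0123456789") == std::string::npos) continue;
        std::size_t dash = token.find('-');
        if (dash == std::string::npos) {
            ++count;  // single CPU id
        } else {
            unsigned long lo = std::stoul(token.substr(0, dash));
            unsigned long hi = std::stoul(token.substr(dash + 1));
            if (hi >= lo) count += static_cast<unsigned>(hi - lo + 1);  // inclusive range
        }
    }
    return count;
}

// Assumption: Linux, where cpu0's sibling list gives the hardware threads
// sharing one physical core. On other systems the file is absent and
// logicalPerPhysical keeps its default of 1.
HardwareInfo queryHardware() {
    HardwareInfo info;
    info.logicalCores = std::thread::hardware_concurrency();
    std::ifstream f("/sys/devices/system/cpu/cpu0/topology/thread_siblings_list");
    std::string line;
    if (f && std::getline(f, line)) {
        unsigned n = countCpusInList(line);
        if (n > 0) info.logicalPerPhysical = n;
    }
    info.physicalCores = info.logicalCores / info.logicalPerPhysical;
    return info;
}
```

A caller would invoke `queryHardware()` once and pass the resulting values on to the classification and mapping steps. Because `std::thread::hardware_concurrency()` may return 0 when the value cannot be determined, a real tool would need a fallback for that case.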