TABLE OF CONTENTS
- ABINIT/m_gwr
- m_gwr/box2gsph
- m_gwr/check_scf_cyle
- m_gwr/desc_copy
- m_gwr/desc_free
- m_gwr/desc_get_vc_sqrt
- m_gwr/desc_init
- m_gwr/desc_t
- m_gwr/desc_to_scbox
- m_gwr/est_print
- m_gwr/est_t
- m_gwr/estimate
- m_gwr/get_1d_sc_phases
- m_gwr/gsph2box
- m_gwr/gwr_build_chi0_head_and_wings
- m_gwr/gwr_build_green
- m_gwr/gwr_build_sigmac
- m_gwr/gwr_build_sigxme
- m_gwr/gwr_build_tchi
- m_gwr/gwr_build_wc
- m_gwr/gwr_cos_transform
- m_gwr/gwr_free
- m_gwr/gwr_gamma_gw
- m_gwr/gwr_get_gkbz_rpr_pm
- m_gwr/gwr_get_myk_green_gpr
- m_gwr/gwr_get_myq_wc_gpr
- m_gwr/gwr_get_u_ngfft
- m_gwr/gwr_get_wc_rpr_qbz
- m_gwr/gwr_gk_to_scbox
- m_gwr/gwr_init
- m_gwr/gwr_load_kcalc_wfd
- m_gwr/gwr_malloc_free_mats
- m_gwr/gwr_ncwrite_tchi_wc
- m_gwr/gwr_print
- m_gwr/gwr_print_mem
- m_gwr/gwr_print_trace
- m_gwr/gwr_read_ugb_from_wfk
- m_gwr/gwr_redistrib_gt_kibz
- m_gwr/gwr_redistrib_mats_qibz
- m_gwr/gwr_rotate_gpm
- m_gwr/gwr_rotate_wc
- m_gwr/gwr_rpa_energy
- m_gwr/gwr_rpr_to_ggp
- m_gwr/gwr_run_chi0
- m_gwr/gwr_run_energy_scf
- m_gwr/gwr_run_g0w0
- m_gwr/gwr_t
- m_gwr/gwr_wcq_to_scbox
- m_gwr/sc_sum
- m_gwr/sig_braket_ur
- m_gwr/write_notations
ABINIT/m_gwr [ Modules ]
NAME
m_gwr
FUNCTION
Objects and procedures implementing the GW method in real-space and imaginary time.
NOTES
Memory and workload are distributed using a 4D cartesian grid: (g/r, tau, k-points, spin). Inside the g/r communicator, we use PBLAS matrices to store G, tchi and W using a 1D processor grid and block distribution along columns. A 2D grid, indeed, would require MPI-FFT or some communication before performing the FFTs along columns. Let's assume for simplicity that we have only two MPI procs in the g/r communicator. Matrices in (g,g') space are distributed along columns so that the g-index is local and we can use sequential zero-padded FFTs to transform from g to r in the unit cell: g'-axis |-------------------- | | | g-axis | P0 | P1 | | | | |-------------------- The results of the FFT transform along g are stored in another PBLAS matrix with the same layout: g'-axis |-------------------- | | | r-axis | P0 | P1 | | | | |-------------------- At this point, we call ptrans to MPI transpose the (r, g') matrix, and we end up with: r-axis |-------------------- | | | g'-axis | P0 | P1 | | | | |-------------------- Differences with respect to the quartic GW code formulated in frequency-domain (real axis) - in GWR, the k-mesh must be Gamma-centered. - All the two-point functions are defined on k/q-centered g-spheres while GW uses a single Gamma-centered sphere. - The frequency/tau meshes are automatically defined by ntau and the KS spectrum (minimax meshes) Technical problems: - it's not clear to me that one can use vc(Sq, SG) when a cutoff is used as the cutoff breaks the spherical symmetry of vc(r). Besides, when symmetries are used to reconstruct the term for q in the BZ, one might have to take into account umklapps. Use cache? - Treatment of the anisotropic behaviour of Wc. This part is badly coded in GW, in the sense that we use a finite small q when computing Wc for q --> 0. This breaks the symmetry of the system and QP degeneracies. The equations needed to express the angular dependency of W(q) for q --> 0 are well known but one has to pass through the Adler-Wiser expression. Possible solution: Compute heads and wings using a WFK_fine wavefunction file with dense k-mesh and less bands. The dipole matrix elements are computed with the DFPT routines, still we need to recode a lot of stuff that is already done in cchi0q0, especially symmetries. Note, however, that tchi is Hermitian along the imaginary axis, expect for omega = 0 in metals but I don't think the minmax grids contain omega = 0. - In principle, it's possible to compute QP correction along a k-path if a new WFK file is provided. The correlated part is evaluated in real-space in the super-cell. For Sigma_x, we need a specialized routine that can handle arbitrary q, especially at the level of v(q, G) but I don't know if this approach will give smooth bands as we don't have q --> 0 when k does not belong to the k-mesh. - New routine to compute oscillator matrix elements with NC/PAW and PBLAS matrices. It can be used to compute tchi head/wings as well as Sigma_x + interface with coupled-cluster codes. - Decide whether we should use VASP conventions for G and the analytic continuation or the "standard" ones by Godby. The standard ones are consistent with Hedin's notations and correspond to the ones used in the legacy GW code. On the other hand, VASP notations make life easier if one has to implement PAW as all the equations have been already derived. - Address nspinor = 2 and PBLAS distribution as MPI proc can have both spinors in memory In other words, we should store the first/last index in gvec for each spinor - Optimization for Gamma-only. Memory and c -> r FFTs - Need to extend FFT API to avoid scaling if isign = -1. Also fft_ug and fft_ur should accept isign optional argument. Refactoring of all the FFT routines used in the GW code is needed in order to exploit R2C, C2R (e.g. chi0(q=0) and GPU version. - Use round-robin distribution instead of blocked-distribution to improve load balance? - Memory peaks: (env3.9) [magianto@uan01 /scratch/project_465000061/magianto/DDIAGO_ZnO] $~/git_repos/abinit/tests/Scripts/abimem.py peaks abimem_rank0.mocc [0] <var=gt_scbox, A@m_gwr.F90:3395, addr=0x14aa53673010, size_mb=379.688> [1] <var=xsum, A@xmpi_sum.finc:2551, addr=0x14aa2fce9010, size_mb=379.688> [2] <var=gt_scbox, A@m_gwr.F90:4338, addr=0x14aa4f64f010, size_mb=379.688> [3] <var=allcg_k, A@m_wfd.F90:4631, addr=0x14aa56b57010, size_mb=217.865> [5] <var=wct_scbox, A@m_gwr.F90:4339, addr=0x14aa43876010, size_mb=189.844> [6] <var=xsum, A@xmpi_sum.finc:2476, addr=0x14aa31bb0010, size_mb=189.844> [7] <var=cg_k, A@m_wfd.F90:4623, addr=0x14aa64535010, size_mb=108.932>
TODO
- Remove cryst%timrev, use kptopt and qptopt - Sig_c breaks QP degeneracies due to fixed q0. NOTES: 1) _slk_mat_t is a CPP macro defined in abi_common.h that allows us to use PBLAS in single/double precision Be careful when using c_f_pointer because there's no type checking.
COPYRIGHT
Copyright (C) 1999-2024 ABINIT group (MG) This file is distributed under the terms of the GNU General Public License, see ~abinit/COPYING or http://www.gnu.org/copyleft/gpl.txt .
SOURCE
120 #if defined HAVE_CONFIG_H 121 #include "config.h" 122 #endif 123 124 #include "abi_common.h" 125 126 module m_gwr 127 128 use defs_basis 129 use m_abicore 130 use m_errors 131 !use mpi 132 use m_xmpi 133 use m_xomp 134 use m_hdr 135 use m_ebands 136 use netcdf 137 use m_nctk 138 use m_dtfil 139 use m_yaml 140 use m_sigtk 141 use, intrinsic :: iso_c_binding 142 use m_hide_blas 143 144 use defs_datatypes, only : pseudopotential_type, ebands_t 145 use defs_abitypes, only : mpi_type 146 use m_gwdefs, only : GW_TOL_DOCC, GW_TOLQ0, GW_TOL_W0, GW_Q0_DEFAULT, cone_gw, czero_gw, j_gw, sigijtab_t, & 147 sigijtab_free, g0g0w 148 use m_time, only : cwtime, cwtime_report, sec2str, timab 149 use m_io_tools, only : iomode_from_fname, get_unit, file_exists, open_file, write_units 150 use m_pstat, only : pstat_t 151 use m_numeric_tools, only : blocked_loop, get_diag, isdiagmat, arth, print_arr, imin_loc, imax_loc, & 152 c2r, linfit, bisect, hermitianize 153 use m_copy, only : alloc_copy 154 use m_geometry, only : normv, vdotw 155 use m_fstrings, only : sjoin, itoa, strcat, ktoa, ltoa, ftoa, string_in, yesno 156 use m_sort, only : sort_dp, sort_rvals 157 use m_krank, only : krank_t, krank_new, krank_from_kptrlatt, get_ibz2bz, star_from_ibz_idx 158 use m_crystal, only : crystal_t 159 use m_dtset, only : dataset_type 160 use m_fftcore, only : get_kg, sphereboundary, getng, print_ngfft, fftcore_set_mixprec, ngfft_seq 161 use m_cgtk, only : cgtk_rotate 162 use m_mpinfo, only : initmpi_seq, destroy_mpi_enreg 163 use m_distribfft, only : init_distribfft_seq 164 use m_kg, only : getcut 165 use m_fft, only : fftbox_plan3_t, uplan_t, fft_ug, fft_ur, fourdp 166 use m_fft_mesh, only : calc_ceikr, calc_ceigr, ctimes_eikr 167 use m_kpts, only : kpts_ibz_from_kptrlatt, kpts_timrev_from_kptopt, kpts_map, kpts_map_print, kpts_pack_in_stars 168 use m_bz_mesh, only : littlegroup_t, findqg0 169 use m_gsphere, only : kg_map, gsphere_t 170 use m_melemts, only : melements_t 171 use m_ioarr, only : fftdatar_write, read_rhor 172 use m_slk, only : matrix_scalapack, slkmat_sp_t, processor_scalapack, slk_array_free, slk_array_set, & 173 slk_array_locmem_mb, block_dist_1d, slk_pgemm 174 use m_wfk, only : wfk_read_ebands, wfk_t, wfk_open_read 175 use m_wfd, only : wfd_init, wfd_t, wfdgw_t 176 use m_ddk, only : ddkop_t, ddkop_new 177 use m_pawtab, only : pawtab_type 178 use m_pawcprj, only : pawcprj_type 179 use m_vcoul, only : vcgen_t 180 use m_vkbr, only : vkbr_t, vkbr_free, vkbr_init, nc_ihr_comm 181 use m_chi0tk, only : chi0_bbp_mask, accumulate_head_wings_imagw, symmetrize_afm_chi0 182 use m_sigx, only : sigx_symmetrize 183 use m_dyson_solver, only : sigma_pade_t 184 !#ifdef __HAVE_GREENX 185 use minimax_grids, only : gx_minimax_grid !, gx_get_error_message 186 !#endif 187 188 implicit none 189 190 private
m_gwr/box2gsph [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
box2gsph
FUNCTION
Extract cg_k array defined on the k-centered g-sphere with npw vectors from the FFT box.
INPUTS
ngfft: n1,n2,n3=physical dimension of the FFT box n4,n5,n6=memory dimension of cfft npw=number of G vectors in basis at this k point ndat=number of items to process kg_k(3,npw)=integer coordinates of G vectors in basis sphere cfft(n4,n5,n6, ndat) = array on FFT box
OUTPUT
cg(npw*ndat)= contains values for npw G vectors in basis sphere
SOURCE
6803 subroutine box2gsph(ngfft, npw, ndat, kg_k, cfft, cg) 6804 6805 !Arguments ------------------------------------ 6806 !scalars 6807 integer,intent(in) :: ngfft(6), npw, ndat 6808 !arrays 6809 integer,intent(in) :: kg_k(3, npw) 6810 complex(gwpc),target,intent(in) :: cfft(ngfft(4)*ngfft(5)*ngfft(6)*ndat) 6811 complex(gwpc),intent(out) :: cg(npw*ndat) 6812 6813 !Local variables------------------------------- 6814 integer :: n1, n2, n3, n4, n5, n6, i1, i2, i3, idat, ipw, icg 6815 complex(gwpc),contiguous,pointer :: cfft_ptr(:,:,:,:) 6816 !character(len=500) :: msg 6817 6818 ! ************************************************************************* 6819 6820 n1 = ngfft(1); n2 = ngfft(2); n3 = ngfft(3) 6821 n4 = ngfft(4); n5 = ngfft(5); n6 = ngfft(6) 6822 call c_f_pointer(c_loc(cfft), cfft_ptr, shape=[n4, n5, n6, ndat]) 6823 6824 ! Extract cg from cfft, ignoring components outside range of cg sphere 6825 !$OMP PARALLEL DO PRIVATE(i1, i2, i3, icg) IF (ndat > 1) 6826 do idat=1,ndat 6827 do ipw=1,npw 6828 i1 = modulo(kg_k(1, ipw), n1) + 1 6829 i2 = modulo(kg_k(2, ipw), n2) + 1 6830 i3 = modulo(kg_k(3, ipw), n3) + 1 6831 !if (any(kg_k(:,ipw) > ngfft(1:3)/2) .or. any(kg_k(:,ipw) < -(ngfft(1:3)-1)/2) ) then 6832 ! write(msg,'(a,3(i0,1x),a)')" The G-vector: ",kg_k(:, ipw)," falls outside the FFT box. Increase boxcutmin (?)" 6833 ! ABI_ERROR(msg) 6834 !end if 6835 icg = ipw + (idat - 1) * npw 6836 cg(icg) = cfft_ptr(i1, i2, i3, idat) 6837 end do 6838 end do 6839 6840 end subroutine box2gsph
m_gwr/check_scf_cyle [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
check_scf_cycle
FUNCTION
INPUTS
OUTPUT
SOURCE
6507 subroutine gwr_check_scf_cycle(gwr, converged) 6508 6509 !Arguments ------------------------------------ 6510 class(gwr_t),intent(in) :: gwr 6511 logical,intent(out) :: converged 6512 6513 !Local variables------------------------------- 6514 integer,parameter :: master = 0 6515 integer :: spin, ikcalc, ik_ibz, band, ib, jb 6516 character(len=500) :: msg 6517 real(dp) :: max_adiff, adiff(gwr%qp_ebands%mband) 6518 integer :: units(2) 6519 6520 ! ************************************************************************* 6521 6522 max_adiff = -one; converged = .True.; units = [std_out, ab_out] 6523 6524 if (gwr%comm%me == master) then 6525 call wrtout(units, sjoin(" Checking for convergence at iteration:", itoa(gwr%scf_iteration))) 6526 end if 6527 6528 associate (now => gwr%qp_ebands, prev => gwr%qp_ebands_prev) 6529 do spin=1,gwr%nsppol 6530 do ikcalc=1,gwr%nkcalc ! TODO: Should be spin dependent! 6531 ! Compute max abs difference between QP at iteration i and i-1. 6532 ik_ibz = gwr%kcalc2ibz(ikcalc, 1) 6533 ib = gwr%bstart_ks(ikcalc, spin); jb = gwr%bstop_ks(ikcalc, spin) 6534 adiff = zero; adiff(ib:jb) = abs(now%eig(ib:jb, ik_ibz, spin) - prev%eig(ib:jb, ik_ibz, spin)) 6535 band = maxloc(adiff, dim=1) 6536 max_adiff = max(max_adiff, adiff(band)) 6537 if (adiff(band) > gwr%dtset%gwr_tolqpe) converged = .False. 6538 if (gwr%comm%me == master) then 6539 ! Write info 6540 write(msg, "(a,i0,1x,2a,i0)") " For k-point: ", ik_ibz, trim(ktoa(now%kptns(:,ik_ibz))),", spin: ", spin 6541 call wrtout(units, msg) 6542 write(msg, "(4x,a,es12.5,a,i0)")"max(abs(E_i - E_{i-1})): ", adiff(band) * Ha_meV, " (meV) for band: ", band 6543 call wrtout(units, msg) 6544 end if 6545 end do 6546 end do 6547 end associate 6548 6549 ! Make sure that all MPI procs agree on this! 6550 call xmpi_land(converged, gwr%comm%value) 6551 6552 if (gwr%comm%me == master) then 6553 write(msg, "(a,i0,a)") "QP gaps at iteration: ",gwr%scf_iteration," (Fermi energy set to zero)" 6554 call ebands_print_gaps(gwr%qp_ebands, std_out, header=msg) 6555 call ebands_print_gaps(gwr%qp_ebands, ab_out, header=msg) 6556 if (.not. converged) then 6557 call wrtout(units," Not converged --> start new iteration ...") 6558 !else 6559 ! call wrtout(units, sjoin(" Convergence achieved at iteration", itoa(gwr%scf_iteration))) 6560 end if 6561 ! TODO: Incremente scf_interation in GWR.nc 6562 end if 6563 6564 end subroutine gwr_check_scf_cycle
m_gwr/desc_copy [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
desc_copy
FUNCTION
Copy object NB: cannot use obj1 = obj2 syntax because ABINIT memory-leak detector won't see the allocation automatically performed by the compiler.
SOURCE
3764 subroutine desc_copy(in_desc, new_desc) 3765 3766 !Arguments ------------------------------------ 3767 class(desc_t),intent(in) :: in_desc 3768 class(desc_t),intent(out) :: new_desc 3769 3770 ! ************************************************************************* 3771 3772 call new_desc%free() 3773 3774 new_desc%istwfk = in_desc%istwfk 3775 new_desc%npw = in_desc%npw 3776 new_desc%ig0 = in_desc%ig0 3777 new_desc%kin_sorted = in_desc%kin_sorted 3778 3779 call alloc_copy(in_desc%gvec, new_desc%gvec) 3780 call alloc_copy(in_desc%gbound, new_desc%gbound) 3781 if (allocated(in_desc%vc_sqrt)) call alloc_copy(in_desc%vc_sqrt, new_desc%vc_sqrt) 3782 3783 if (allocated(in_desc%g2box)) then 3784 call alloc_copy(in_desc%g2box, new_desc%g2box) 3785 new_desc%cached_sc_ngfft = in_desc%cached_sc_ngfft 3786 end if 3787 3788 end subroutine desc_copy
m_gwr/desc_free [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
desc_free
FUNCTION
Free memory
SOURCE
3885 subroutine desc_free(desc) 3886 3887 !Arguments ------------------------------------ 3888 class(desc_t),intent(inout) :: desc 3889 ! ************************************************************************* 3890 3891 ABI_SFREE(desc%gvec) 3892 ABI_SFREE(desc%gbound) 3893 ABI_SFREE(desc%vc_sqrt) 3894 ABI_SFREE(desc%g2box) 3895 desc%cached_sc_ngfft = -1 3896 3897 end subroutine desc_free
m_gwr/desc_get_vc_sqrt [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
desc_get_vc_sqrt
FUNCTION
Compute square root of the Coulomb interaction vc(q,g).
SOURCE
3732 subroutine desc_get_vc_sqrt(desc, qpt, q_is_gamma, gwr, comm) 3733 3734 !Arguments ------------------------------------ 3735 class(desc_t),intent(inout) :: desc 3736 real(dp),intent(in) :: qpt(3) 3737 logical, intent(in) :: q_is_gamma 3738 class(gwr_t),intent(in) :: gwr 3739 integer,intent(in) :: comm 3740 3741 ! ************************************************************************* 3742 3743 ABI_UNUSED([q_is_gamma]) 3744 if (allocated(desc%vc_sqrt)) return 3745 ABI_MALLOC(desc%vc_sqrt, (desc%npw)) 3746 call gwr%vcgen%get_vc_sqrt(qpt, desc%npw, desc%gvec, gwr%q0, gwr%cryst, desc%vc_sqrt, comm) 3747 3748 end subroutine desc_get_vc_sqrt
m_gwr/desc_init [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
desc_init
FUNCTION
Initialize the descriptor.
INPUTS
OUTPUT
SOURCE
3688 subroutine desc_init(desc, kk, istwfk, ecut, gwr, kin_sorted) 3689 3690 !Arguments ------------------------------------ 3691 class(desc_t),intent(inout) :: desc 3692 real(dp),intent(in) :: kk(3) 3693 integer,intent(in) :: istwfk 3694 real(dp),intent(in) :: ecut 3695 class(gwr_t),intent(in) :: gwr 3696 logical,optional,intent(in) :: kin_sorted 3697 3698 !Local variables------------------------------- 3699 integer :: ig 3700 3701 ! ************************************************************************* 3702 3703 desc%kin_sorted = .False.; if (present(kin_sorted)) desc%kin_sorted = kin_sorted 3704 desc%istwfk = istwfk 3705 call get_kg(kk, desc%istwfk, ecut, gwr%cryst%gmet, desc%npw, desc%gvec, kin_sorted=desc%kin_sorted) 3706 3707 ABI_MALLOC(desc%gbound, (2 * gwr%g_mgfft + 8, 2)) 3708 call sphereboundary(desc%gbound, desc%istwfk, desc%gvec, gwr%g_mgfft, desc%npw) 3709 3710 ! Find the index of g = 0. 3711 desc%ig0 = -1 3712 do ig=1,desc%npw 3713 if (all(desc%gvec(:,ig) == 0)) then 3714 desc%ig0 = ig; exit 3715 end if 3716 end do 3717 3718 end subroutine desc_init
m_gwr/desc_t [ Types ]
NAME
desc_t
FUNCTION
Parameters related to a two-point function such as gvectors, tables used for zero padded FFTs and matrix elements of the Coulomb interaction.
SOURCE
204 type,public :: desc_t 205 206 integer :: istwfk = 1 207 ! Storage mode for this k/q point. 208 209 integer :: npw = -1 210 ! Total number of plane-waves for this k/q-point. 211 212 integer :: ig0 = -1 213 ! Index of g=0 in gvec. 214 215 logical :: kin_sorted 216 ! True if gvec are sorted by |q+g|^2/2 217 218 integer,allocatable :: gvec(:,:) 219 ! (3, npw) 220 ! G-vectors in reduced coordinates. 221 ! Note that this array is global i.e. it is not MPI-distributed inside the PBLAS communicator. 222 223 integer,allocatable :: gbound(:,:) 224 ! (2*mgfft+8, 2) 225 ! sphere boundary info for zero-padded FFT 226 227 integer,allocatable :: g2box(:) 228 ! (npw) 229 ! Index of gvec in the supercell FFT box. 230 231 integer :: cached_sc_ngfft(6) = -1 232 233 complex(gwpc),allocatable :: vc_sqrt(:) 234 ! (npw) 235 ! Square root of the Coulomb interaction in reciprocal space. 236 ! Allocated and computed for tchi/W descriptors. 237 ! A cutoff might be applied. 238 239 contains 240 241 procedure :: init => desc_init 242 ! Initialize the object 243 244 procedure :: copy => desc_copy 245 ! Copy object. 246 247 procedure :: to_scbox => desc_to_scbox 248 ! Copy object. 249 250 procedure :: get_vc_sqrt => desc_get_vc_sqrt 251 ! Compute square root of vc(q,g). 252 253 procedure :: free => desc_free 254 ! Free memory. 255 256 end type desc_t 257 258 interface desc_array_free 259 module procedure desc_array1_free 260 end interface desc_array_free
m_gwr/desc_to_scbox [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
desc_to_scbox
FUNCTION
Insert cg_k array defined on the k-centered g-sphere with npw vectors inside the FFT box. The main difference wrt to sphere is that cfft is not initialized to zero. See notes below.
INPUTS
sc_ngfft: n1,n2,n3=physical dimension of the FFT box n4,n5,n6=memory dimension of cfft npw=number of G vectors in basis at this k point ndat=number of items to process cg(npw*ndat)= contains values for npw G vectors in basis sphere
OUTPUT
cfft(n4,n5,n6*ndat) = array on FFT box filled with cg data Note that cfft is intent(inout) so that we can add contributions from different k-points.
SOURCE
3815 subroutine desc_to_scbox(desc, kk, ngkpt, sc_ngfft, ndat, cg, cfft) 3816 3817 !Arguments ------------------------------------ 3818 !scalars 3819 class(desc_t),intent(inout) :: desc 3820 real(dp),intent(in) :: kk(3) 3821 integer,intent(in) :: ngkpt(3) 3822 integer,intent(in) :: sc_ngfft(6), ndat 3823 !arrays 3824 complex(gwpc),intent(in) :: cg(desc%npw, ndat) 3825 complex(gwpc),intent(inout) :: cfft(sc_ngfft(4)*sc_ngfft(5)*sc_ngfft(6),ndat) 3826 3827 !Local variables------------------------------- 3828 integer :: n1, n2, n3, n4, n5, n6, i1, i2, i3, idat, ipw, kg(3), gg(3), ifft 3829 logical :: compute_mapping 3830 !real(dp) :: tsec(2) !, cpu, wall, gflops 3831 !character(len=500) :: msg 3832 3833 ! ************************************************************************* 3834 3835 !call timab(1931, 1, tsec) 3836 3837 n1 = sc_ngfft(1); n2 = sc_ngfft(2); n3 = sc_ngfft(3) 3838 n4 = sc_ngfft(4); n5 = sc_ngfft(5); n6 = sc_ngfft(6) 3839 gg = nint(kk * ngkpt) 3840 3841 compute_mapping = .not. allocated(desc%g2box) .or. any(desc%cached_sc_ngfft /= sc_ngfft(1:6)) 3842 3843 ! FIXME This is not thread safe 3844 if (compute_mapping) then 3845 ABI_REMALLOC(desc%g2box, (desc%npw)) 3846 desc%cached_sc_ngfft = sc_ngfft(1:6) 3847 do ipw=1,desc%npw 3848 kg = gg + ngkpt * desc%gvec(:,ipw) ! k+g 3849 i1 = modulo(kg(1), n1) !+ 1 3850 i2 = modulo(kg(2), n2) !+ 1 3851 i3 = modulo(kg(3), n3) !+ 1 3852 desc%g2box(ipw) = 1 + i1 + n4*(i2+i3*n5) 3853 end do 3854 end if 3855 3856 ! Insert cg into cfft 3857 !$OMP PARALLEL DO PRIVATE(i1, i2, i3) IF (ndat > 1) 3858 do idat=1,ndat 3859 do ipw=1,desc%npw 3860 !if (any(kg_k(:,ipw) > sc_ngfft(1:3)/2) .or. any(kg_k(:,ipw) < -(sc_ngfft(1:3)-1)/2) ) then 3861 ! write(msg,'(a,3(i0,1x),a)')" The G-vector: ",kg_k(:, ipw)," falls outside the FFT box. Increase boxcutmin (?)" 3862 ! ABI_ERROR(msg) 3863 !end if 3864 ifft = desc%g2box(ipw) 3865 cfft(ifft,idat) = cg(ipw,idat) 3866 end do 3867 end do 3868 3869 !call timab(1931, 2, tsec) 3870 3871 end subroutine desc_to_scbox
m_gwr/est_print [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
est_print
FUNCTION
SOURCE
1760 subroutine est_print(est, units) 1761 1762 !Arguments ------------------------------------ 1763 class(est_t), intent(in) :: est 1764 integer,intent(in) :: units(:) 1765 1766 !Local variables------------------------------- 1767 character(len=4),parameter :: fmt = "f8.1" 1768 1769 ! ************************************************************************* 1770 1771 call wrtout(units, "- Resident memory in Mb for G(g,g',+/-tau) and chi(g,g',tau):") 1772 call wrtout(units, sjoin("- G_k(g,g,tau): ", ftoa(est%mem_green_gg, fmt=fmt))) 1773 call wrtout(units, sjoin("- Chi_q(g,g,tau): ", ftoa(est%mem_chi_gg, fmt=fmt))) 1774 call wrtout(units, sjoin("- u_k(g,b): ", ftoa(est%mem_ugb, fmt=fmt))) 1775 call wrtout(units, "- Temporary memory allocated inside the tau loops:") 1776 call wrtout(units, sjoin("- G_k(r,g): ", ftoa(est%mem_green_rg, fmt=fmt))) 1777 call wrtout(units, sjoin("- chi_q(r,g): ", ftoa(est%mem_chi_rg, fmt=fmt))) 1778 1779 end subroutine est_print
m_gwr/est_t [ Types ]
NAME
est_t
FUNCTION
Memory is given in Mb
SOURCE
274 type, public :: est_t 275 276 real(dp) :: mem_green_gg = zero 277 real(dp) :: mem_green_rg = zero 278 real(dp) :: mem_chi_gg = zero 279 real(dp) :: mem_chi_rg = zero 280 real(dp) :: mem_ugb = zero 281 real(dp) :: mem_total = zero 282 real(dp) :: efficiency = zero 283 real(dp) :: speedup = zero 284 285 contains 286 procedure :: print => est_print 287 end type est_t
m_gwr/estimate [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
estimate
FUNCTION
Estimate memory requirements and the parallel speedup of a given `np_kgts` configuration.
SOURCE
1695 type(est_t) pure function estimate(gwr, np_kgts) result(est) 1696 1697 !Arguments ------------------------------------ 1698 class(gwr_t),intent(in) :: gwr 1699 integer,intent(in) :: np_kgts(4) 1700 1701 !Local variables------------------------------- 1702 real(dp) :: np_k, np_g, np_t, np_s, w_k, w_g, w_t, w_s, np_tot 1703 1704 ! ************************************************************************* 1705 1706 ! Use real quantities to avoid integer division 1707 np_k = np_kgts(1); np_g = np_kgts(2); np_t = np_kgts(3); np_s = np_kgts(4) 1708 np_tot = product(real(np_kgts)) 1709 1710 ! NB: array dimensioned with nkibz and nqibz do not scale as 1/np_k as we distribute the BZ, IBZ points might be replicated. 1711 1712 ! Resident memory in Mb for G(g,g',+/-tau) and chi(g,g',tau) 1713 est%mem_green_gg = two * two * (one*gwr%nspinor*gwr%green_mpw)**2 * two*gwr%ntau * gwr%nkibz * gwr%nsppol * gwp*b2Mb / np_tot 1714 est%mem_chi_gg = two * (one*gwr%tchi_mpw)**2 * gwr%ntau * gwr%nqibz * gwp*b2Mb / (np_g * np_t * np_k) 1715 est%mem_ugb = two * gwr%green_mpw * gwr%nspinor * gwr%dtset%nband(1) * gwr%nkibz * gwr%nsppol * gwp*b2Mb / np_tot 1716 1717 ! Temporary memory allocated inside the tau loops. 1718 ! This is the chunck we have to minimize by increasing np_g and/or np_k to avoid going OOM. 1719 est%mem_green_rg = two * two * gwr%nspinor**2 * gwr%green_mpw * gwr%g_nfft * gwr%nkbz * gwr%nsppol * gwp*b2Mb / (np_g * np_k) 1720 est%mem_chi_rg = two * gwr%tchi_mpw * gwr%g_nfft * gwr%nqbz * gwp*b2Mb / (np_g * np_k) 1721 1722 est%mem_total = est%mem_green_gg + est%mem_chi_gg + est%mem_ugb + est%mem_green_rg + est%mem_chi_rg 1723 1724 ! Estimate speedup and parallel efficiency using heuristic weights. Note g_nfft instead of green_mpw. 1725 w_k = 0.799_dp; w_g = 0.899_dp; w_t = 1.1_dp; w_s = 1.2_dp 1726 1727 ! Promote kpt parallelism under particular circumstances. 1728 if (gwr%nkbz > 4**3) w_k = w_g + tol2 * merge(+1, -5, np_k < 5) 1729 1730 est%speedup = speedup(gwr%nkbz, nint(np_k), w_k) * speedup(gwr%g_nfft, nint(np_g), w_g) * & 1731 speedup(gwr%ntau, nint(np_t), w_t) * speedup(gwr%nsppol, nint(np_s), w_s) 1732 est%efficiency = est%speedup / np_tot 1733 1734 contains 1735 1736 real(dp) pure function speedup(size, np, weight) 1737 ! Expected speedup for a `size` problem and `np` processes 1738 integer,intent(in) :: size, np 1739 real(dp),intent(in) :: weight 1740 if (np == 1) then 1741 speedup = one 1742 else 1743 speedup = (weight*size) / (one* ((size / np) + merge(0, 1, mod(size, np) == 0))) 1744 end if 1745 end function speedup 1746 1747 end function estimate
m_gwr/get_1d_sc_phases [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
get_1d_sc_phases
FUNCTION
Compute one-dimensional factors in the supercell.
INPUTS
OUTPUT
SOURCE
7913 subroutine get_1d_sc_phases(sc_shape, nkpt, kpts, ph1d) 7914 7915 !Arguments ------------------------------------ 7916 integer,intent(in) :: sc_shape(3), nkpt 7917 real(dp),intent(in) :: kpts(3, nkpt) 7918 complex(gwpc),allocatable,intent(out) :: ph1d(:,:,:) 7919 7920 !Local variables------------------------------- 7921 integer :: ikpt, ix, iy, iz 7922 real(dp) :: arg, fact, kk(3) 7923 7924 ! ************************************************************************* 7925 7926 ABI_MALLOC(ph1d, (maxval(sc_shape), 3, nkpt)) 7927 7928 do ikpt=1,nkpt 7929 kk = kpts(:, ikpt) 7930 fact = two_pi * kk(1) 7931 do ix=0,sc_shape(1) - 1 7932 arg = fact * ix 7933 ph1d(ix + 1, 1, ikpt) = cmplx(cos(arg), sin(arg), kind=gwpc) 7934 end do 7935 fact = two_pi * kk(2) 7936 do iy=0,sc_shape(2) - 1 7937 arg = fact * iy 7938 ph1d(iy + 1, 2, ikpt) = cmplx(cos(arg), sin(arg), kind=gwpc) 7939 end do 7940 fact = two_pi * kk(3) 7941 do iz=0,sc_shape(3) - 1 7942 arg = fact * iz 7943 ph1d(iz + 1, 3, ikpt) = cmplx(cos(arg), sin(arg), kind=gwpc) 7944 end do 7945 end do ! ikpt 7946 7947 end subroutine get_1d_sc_phases
m_gwr/gsph2box [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gsph2box
FUNCTION
Insert cg_k array defined on the k-centered g-sphere with npw vectors inside the FFT box. The main difference wrt to sphere is that cfft is not initialized to zero. See notes below.
INPUTS
ngfft: n1,n2,n3=physical dimension of the FFT box n4,n5,n6=memory dimension of cfft npw=number of G vectors in basis at this k point ndat=number of items to process kg_k(3,npw)=integer coordinates of G vectors in basis sphere cg(npw*ndat)= contains values for npw G vectors in basis sphere
OUTPUT
cfft(n4,n5,n6*ndat) = array on FFT box filled with cg data Note that cfft is intent(inout) so that we can add contributions from different k-points.
SOURCE
6738 subroutine gsph2box(ngfft, npw, ndat, kg_k, cg, cfft) 6739 6740 !Arguments ------------------------------------ 6741 !scalars 6742 integer,intent(in) :: ngfft(6), npw, ndat 6743 !arrays 6744 integer,intent(in) :: kg_k(3, npw) 6745 complex(gwpc),intent(in) :: cg(npw * ndat) 6746 complex(gwpc),target,intent(inout) :: cfft(ngfft(4)*ngfft(5)*ngfft(6)*ndat) 6747 6748 !Local variables------------------------------- 6749 integer :: n1, n2, n3, n4, n5, n6, i1, i2, i3, idat, ipw 6750 complex(gwpc),contiguous,pointer :: cfft_ptr(:,:,:,:) 6751 !real(dp) :: tsec(2) !, cpu, wall, gflops 6752 !character(len=500) :: msg 6753 6754 ! ************************************************************************* 6755 6756 !call timab(1931, 1, tsec) 6757 n1 = ngfft(1); n2 = ngfft(2); n3 = ngfft(3) 6758 n4 = ngfft(4); n5 = ngfft(5); n6 = ngfft(6) 6759 call c_f_pointer(c_loc(cfft), cfft_ptr, shape=[n4, n5, n6, ndat]) 6760 6761 ! Insert cg into cfft 6762 !$OMP PARALLEL DO PRIVATE(i1, i2, i3) IF (ndat > 1) 6763 do idat=1,ndat 6764 do ipw=1,npw 6765 i1 = modulo(kg_k(1, ipw), n1) + 1 6766 i2 = modulo(kg_k(2, ipw), n2) + 1 6767 i3 = modulo(kg_k(3, ipw), n3) + 1 6768 !if (any(kg_k(:,ipw) > ngfft(1:3)/2) .or. any(kg_k(:,ipw) < -(ngfft(1:3)-1)/2) ) then 6769 ! write(msg,'(a,3(i0,1x),a)')" The G-vector: ",kg_k(:, ipw)," falls outside the FFT box. Increase boxcutmin (?)" 6770 ! ABI_ERROR(msg) 6771 !end if 6772 cfft_ptr(i1,i2,i3,idat) = cg(ipw+npw*(idat-1)) 6773 end do 6774 end do 6775 !call timab(1931, 2, tsec) 6776 6777 end subroutine gsph2box
m_gwr/gwr_build_chi0_head_and_wings [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_build_chi0_head_and_wings
FUNCTION
Compute head and wings of chi0 on the minimax frequency grid.
SOURCE
6854 subroutine gwr_build_chi0_head_and_wings(gwr) 6855 6856 !Arguments ------------------------------------ 6857 class(gwr_t),target,intent(inout) :: gwr 6858 6859 !Local variables------------------------------- 6860 !scalars 6861 integer,parameter :: two_poles = 2, one_pole = 1, gwcomp0 = 0, spmeth0 = 0 6862 integer :: nsppol, nspinor, ierr, my_is, spin, my_ikf, itau, my_it 6863 integer :: ik_bz, ik_ibz, isym_k, trev_k, g0_k(3) 6864 !integer :: iq_bz, iq_ibz, isym_q, trev_q, g0_q(3) 6865 integer :: nkpt_summed, use_umklp, band1, band2, band1_start, band1_stop, band1_max 6866 integer :: ib, il_b1, il_b2, nb, block_size, ii, mband, block_counter 6867 integer :: istwf_ki, npw_ki, istwf_kf, nI, nJ, nomega, io, iq, nq, dim_rtwg !ig, 6868 integer :: npwe, u_nfft, u_mgfft, u_mpw 6869 logical :: isirr_k, use_tr, is_metallic, print_time 6870 real(dp) :: spin_fact, weight, deltaf_b1b2, deltaeGW_b1b2, gwr_boxcutmin_c, zcut, qlen, eig_nk, e0 6871 real(dp) :: cpu_all, wall_all, gflops_all, cpu_k, wall_k, gflops_k 6872 complex(dpc) :: deltaeKS_b1b2 6873 type(__slkmat_t),pointer :: ugb_kibz 6874 character(len=5000) :: msg 6875 type(crystal_t),pointer :: cryst 6876 type(dataset_type),pointer :: dtset 6877 type(ebands_t),pointer :: now_ebands 6878 type(littlegroup_t) :: ltg_q 6879 type(desc_t),pointer :: desc_ki 6880 !arrays 6881 integer :: gmax(3), u_ngfft(18), work_ngfft(18), units(2) ! spinor_padx(2,4), g0(3), 6882 integer,contiguous, pointer :: kg_ki(:,:) 6883 integer,allocatable :: gvec_q0(:,:), gbound_q0(:,:), u_gbound(:,:) 6884 real(dp) :: kk_ibz(3), kk_bz(3), tsec(2) 6885 real(dp),contiguous, pointer :: qp_eig(:,:,:), qp_occ(:,:,:), ks_eig(:,:,:) !, cwave(:,:) 6886 real(dp),allocatable :: work(:,:,:,:), qdirs(:,:) 6887 logical :: gradk_not_done(gwr%nkibz) 6888 logical,allocatable :: bbp_mask(:,:) 6889 complex(dpc) :: chq(3) !, wng(3) 6890 !complex(dp),allocatable :: ug1_block(:,:) 6891 complex(gwpc) :: rhotwx(3, gwr%nspinor**2) !, new_rhotwx(3, gwr%nspinor**2) 6892 complex(gwpc),allocatable :: ug2(:), ur1_kibz(:), ur2_kibz(:), ur_prod(:), rhotwg(:), ug1_block(:,:), ug1(:) 6893 complex(dpc) :: green_w(gwr%ntau), omega(gwr%ntau) 6894 complex(dpc),allocatable :: chi0_lwing(:,:,:), chi0_uwing(:,:,:), chi0_head(:,:,:), head_qvals(:) 6895 real(dp), allocatable :: gh1c_block(:,:,:,:) 6896 type(vkbr_t),allocatable :: vkbr(:) 6897 type(gsphere_t) :: gsph 6898 type(ddkop_t) :: ddkop 6899 !type(pawcprj_type),allocatable :: cwaveprj(:,:) 6900 6901 ! ************************************************************************* 6902 6903 call timab(1927, 1, tsec) 6904 call cwtime(cpu_all, wall_all, gflops_all, "start") 6905 units = [std_out, ab_out] 6906 call wrtout(units, sjoin(" Computing chi0 head and wings with inclvkb:", itoa(gwr%dtset%inclvkb)), pre_newlines=1) 6907 6908 nspinor = gwr%nspinor; nsppol = gwr%nsppol; dtset => gwr%dtset; cryst => gwr%cryst 6909 use_tr = gwr%dtset%awtr == 1; zcut = gwr%dtset%zcut ! well, it's not used in g0w0 when omega is complex. 6910 6911 ! Use KS or QP energies depending on the iteration state. 6912 if (gwr%scf_iteration == 1) then 6913 call wrtout(units, " Using KS orbitals and KS energies...", newlines=1, do_flush=.True.) 6914 qp_eig => gwr%ks_ebands%eig; qp_occ => gwr%ks_ebands%occ 6915 now_ebands => gwr%ks_ebands 6916 else 6917 call wrtout(units, " Using KS orbitals and QP energies...", newlines=1, do_flush=.True.) 6918 qp_eig => gwr%qp_ebands%eig; qp_occ => gwr%qp_ebands%occ 6919 now_ebands => gwr%qp_ebands 6920 end if 6921 6922 ks_eig => gwr%ks_ebands%eig 6923 mband = gwr%ks_ebands%mband 6924 6925 is_metallic = ebands_has_metal_scheme(now_ebands) 6926 6927 ! Setup weight (2 for spin unpolarized systems, 1 for polarized). 6928 ! spin_fact is used to normalize the occupation factors to one. 6929 ! Consider also the AFM case. 6930 select case (nsppol) 6931 case (1) 6932 weight = two / gwr%nkbz; spin_fact = half 6933 if (gwr%nspden == 2) then 6934 weight = one / gwr%nkbz; spin_fact = half 6935 end if 6936 if (nspinor == 2) then 6937 weight = one / gwr%nkbz; spin_fact = one 6938 end if 6939 case (2) 6940 weight = one / gwr%nkbz; spin_fact = one 6941 case default 6942 ABI_BUG(sjoin("Wrong nsppol:", itoa(nsppol))) 6943 end select 6944 6945 ! TODO: Replace vkbr with ddk and factorize calls to DDK |bra> 6946 ABI_MALLOC(vkbr, (gwr%nkibz)) 6947 gradk_not_done = .TRUE. 6948 6949 ! TODO: Might become 1b 6950 ABI_MALLOC(bbp_mask, (mband, mband)) 6951 6952 ! ========================================= 6953 ! Find FFT mesh and max number of g-vectors 6954 ! ========================================= 6955 ! TODO: Can be decreased. Consider also fftgw 6956 gwr_boxcutmin_c = two 6957 !gwr_boxcutmin_c = one 6958 call gwr%get_u_ngfft(gwr_boxcutmin_c, u_ngfft, u_nfft, u_mgfft, u_mpw, gmax) 6959 6960 ! Init work_ngfft 6961 gmax = gmax + 4 ! FIXME: this is to account for umklapp, should also consider Gamma-only and istwfk 6962 gmax = 2 * gmax + 1 6963 call ngfft_seq(work_ngfft, gmax) 6964 !write(std_out,*)"work_ngfft(1:3): ",work_ngfft(1:3) 6965 ABI_MALLOC(work, (2, work_ngfft(4), work_ngfft(5), work_ngfft(6))) 6966 6967 if (gwr%comm%me == 0) then 6968 call print_ngfft(u_ngfft, header="FFT mesh for chi0 head/wings computation", unit=std_out) 6969 !call print_ngfft(u_ngfft, header="FFT mesh for chi0 head/wings computation", unit=ab_out) 6970 endif 6971 6972 ! Need to broadcast G-vectors at q = 0 if k/q-point parallelism is activated. 6973 if (gwr%kpt_comm%me == 0) then 6974 npwe = gwr%tchi_desc_qibz(1)%npw 6975 ABI_CHECK(gwr%tchi_desc_qibz(1)%kin_sorted, "g-vectors are not sorted by |q+g|^2/2 !") 6976 end if 6977 call xmpi_bcast(npwe, 0, gwr%kpt_comm%value, ierr) 6978 ABI_MALLOC(gvec_q0, (3, npwe)) 6979 if (gwr%kpt_comm%me == 0) gvec_q0 = gwr%tchi_desc_qibz(1)%gvec 6980 call xmpi_bcast(gvec_q0, 0, gwr%kpt_comm%value, ierr) 6981 6982 ! This is needed to call accumulate_head_wings_imagw 6983 call gsph%init(cryst, npwe, gvec_q0) 6984 6985 ABI_MALLOC(gbound_q0, (2 * u_mgfft + 8, 2)) 6986 call sphereboundary(gbound_q0, istwfk1, gvec_q0, u_mgfft, npwe) 6987 6988 ! Init little group to find IBZ_q 6989 use_umklp = 0 6990 call ltg_q%init([zero, zero, zero], gwr%nkbz, gwr%kbz, cryst, use_umklp, npwe) !, gvec=gvec_kss) 6991 6992 nkpt_summed = gwr%nkbz 6993 if (dtset%symchi /= 0) then 6994 nkpt_summed = ltg_q%nibz_ltg 6995 call ltg_q%print(std_out, dtset%prtvol) 6996 end if 6997 !call wrtout(std_out, sjoin(' Calculation status: ', itoa(nkpt_summed), ' k-points to be completed')) 6998 6999 ! ============================================ 7000 ! === Begin big fat loop over transitions ==== 7001 ! ============================================ 7002 7003 ! NB: One might reduce the number of bands as head and wings converge fast wrt nband and slow wrt k-mesh. 7004 ! Should introduce a tolerance on the frequency part computed at the first minimax frequency and 7005 ! compute max_nband from this. 7006 7007 ! Find band1_max from gwr_max_hwtene 7008 band1_max = gwr%ugb_nband 7009 7010 if (gwr%dtset%gwr_max_hwtene > zero) then 7011 ! Set e0 to top of valence band if semiconductor else Fermi level 7012 e0 = now_ebands%fermie 7013 if (all(gwr%ks_gaps%ierr == 0)) e0 = minval(gwr%ks_gaps%vb_max) 7014 do band1_start=1, gwr%ugb_nband 7015 if (all(qp_eig(band1_start,:,:) - e0 > gwr%dtset%gwr_max_hwtene)) then 7016 band1_max = band1_start; exit 7017 end if 7018 end do 7019 !else if (gwr%dtset%gwr_max_hwtene < zero) then 7020 ! band1_max = min(nint(-gwr%dtset%gwr_max_hwtene) gwr%ugb_nband) 7021 end if 7022 7023 call wrtout(std_out, sjoin(" gwr_max_hwtene:", ftoa(gwr%dtset%gwr_max_hwtene * Ha_eV), " (eV)")) 7024 call wrtout(std_out, sjoin(" Using: ", itoa(band1_max), "/", itoa(gwr%ugb_nband), "bands for chi0 head and wings.")) 7025 7026 ! Loop on spin to calculate $\chi_{\up,\up} + \chi_{\down,\down}$ 7027 ! TODO: nspinor 2 7028 nI = 1; nJ = 1; nomega = gwr%ntau 7029 omega(:) = j_dpc * gwr%iw_mesh(:) 7030 ABI_CALLOC(chi0_lwing, (npwe*nI, nomega, 3)) 7031 ABI_CALLOC(chi0_uwing, (npwe*nJ, nomega, 3)) 7032 ABI_CALLOC(chi0_head, (3, 3, nomega)) 7033 7034 ABI_MALLOC(u_gbound, (2 * u_mgfft + 8, 2)) 7035 ABI_MALLOC(ur1_kibz, (u_nfft * nspinor)) 7036 ABI_MALLOC(ur2_kibz, (u_nfft * nspinor)) 7037 ABI_MALLOC(ur_prod, (u_nfft * nspinor)) 7038 dim_rtwg = 1 !; if (nspinor==2) dim_rtwg=2 ! Can reduce size depending on Ep%nI and Ep%nj 7039 ABI_MALLOC(rhotwg, (npwe * dim_rtwg)) 7040 7041 ! TODO: 7042 ddkop = ddkop_new(dtset, gwr%cryst, gwr%pawtab, gwr%psps, gwr%mpi_enreg, u_mpw, u_ngfft) 7043 7044 do my_is=1,gwr%my_nspins 7045 spin = gwr%my_spins(my_is) 7046 7047 ! Loop over my k-points in the BZ. 7048 do my_ikf=1,gwr%my_nkbz 7049 ik_bz = gwr%my_kbz_inds(my_ikf); kk_bz = gwr%kbz(:, ik_bz) 7050 istwf_kf = 1 7051 !istwf_kf = gwt% ??? 7052 7053 if (dtset%symchi == 1 .and. ltg_q%ibzq(ik_bz) /= 1) CYCLE ! Only IBZ_q 7054 print_time = gwr%comm%me == 0 .and. (my_ikf <= LOG_MODK .or. mod(my_ikf, LOG_MODK) == 0) 7055 if (print_time) call cwtime(cpu_k, wall_k, gflops_k, "start") 7056 7057 ! FIXME: Be careful with the symmetry conventions here! 7058 ! and the interplay between umklapp in q and FFT 7059 ! Also, the assembly_chi0 routines assume symrec and trev_k in [1, 2] 7060 ik_ibz = gwr%kbz2ibz_symrel(1, ik_bz); isym_k = gwr%kbz2ibz_symrel(2, ik_bz) 7061 trev_k = gwr%kbz2ibz_symrel(6, ik_bz); g0_k = gwr%kbz2ibz_symrel(3:5, ik_bz) 7062 isirr_k = (isym_k == 1 .and. trev_k == 0 .and. all(g0_k == 0)) 7063 kk_ibz = gwr%kibz(:, ik_ibz) 7064 7065 ugb_kibz => gwr%ugb(ik_ibz, spin) 7066 desc_ki => gwr%green_desc_kibz(ik_ibz) 7067 npw_ki = desc_ki%npw 7068 istwf_ki = desc_ki%istwfk 7069 kg_ki => desc_ki%gvec 7070 7071 ABI_MALLOC(ug1, (npw_ki * nspinor)) 7072 ABI_MALLOC(ug2, (npw_ki * nspinor)) 7073 7074 call sphereboundary(u_gbound, istwf_ki, kg_ki, u_mgfft, npw_ki) 7075 7076 if (gwr%usepaw == 0 .and. dtset%inclvkb /= 0 .and. gradk_not_done(ik_ibz)) then 7077 ! Include term <n,k|[Vnl,iqr]|n"k>' for q -> 0. 7078 call vkbr_init(vkbr(ik_ibz), cryst, gwr%psps, dtset%inclvkb, istwf_ki, npw_ki, kk_ibz, kg_ki) 7079 gradk_not_done(ik_ibz) = .FALSE. 7080 end if 7081 7082 !call ddkop%setup_spin_kpoint(gwr%dtset, gwr%cryst, gwr%psps, spin, kk_bz, istwf_kk, npw_ki, kg_ki) 7083 7084 !call wfd%copy_cg(ib_v, ik, spin, cg_v) 7085 !call ddkop%apply(ebands%eig(ib_v, ik, spin), npw_k, wfd%nspinor, cg_v, cwaveprj) 7086 7087 !call wfd%copy_cg(ib_c, ik, spin, cg_c) 7088 !vv = ddkop%get_braket(ebands%eig(ib_c, ik, spin), istwf_k, npw_k, nspinor, cg_c, mode=ds%mode) 7089 7090 ! HM: 24/07/2018 7091 ! Transform dipoles to be consistent with results from DFPT 7092 ! Perturbations with DFPT are along the reciprocal lattice vectors 7093 ! Perturbations with Commutator are along real space lattice vectors 7094 ! dot(A, DFPT) = X 7095 ! dot(B, COMM) = X 7096 ! B = 2 pi (A^{-1})^T => 7097 ! dot(B^T B,COMM) = 2 pi DFPT 7098 !vr = (2*pi)*(2*pi)*sum(ihrc(:,:),dim=2) 7099 !vg(1) = dot_product(Cryst%gmet(1,:), vr) 7100 !vg(2) = dot_product(Cryst%gmet(2,:), vr) 7101 !vg(3) = dot_product(Cryst%gmet(3,:), vr) 7102 7103 call chi0_bbp_mask(ik_ibz, ik_ibz, spin, spin_fact, use_tr, & 7104 gwcomp0, spmeth0, gwr%ugb_nband, mband, now_ebands, bbp_mask) 7105 !bbp_mask = .True. 7106 7107 ! FIXME: This part should be tested with tau/g-para 7108 ! TODO: 7109 ! 1) Logic to determine block_size from memory. 7110 ! 2) Add support for symchi = 0 7111 ! 3) Invert the loops 7112 7113 block_size = min(48, gwr%ugb_nband) 7114 !block_size = min(200, gwr%ugb_nband) 7115 !block_size = 1 7116 7117 block_counter = 0 7118 do band1_start=1, gwr%ugb_nband, block_size 7119 block_counter = block_counter + 1 7120 ! Distribute blocks inside tau_comm as wavefunctions are replicated 7121 if (gwr%tau_comm%skip(block_counter)) cycle 7122 7123 if (all(.not. bbp_mask(band1_start:, :))) then 7124 !print *, "exiting band1_start loop" 7125 exit 7126 end if 7127 7128 !print *, "band1_start, gwr%ugb_nband, block_size", band1_start, gwr%ugb_nband, block_size 7129 nb = blocked_loop(band1_start, gwr%ugb_nband, block_size) 7130 band1_stop = band1_start + nb - 1 7131 if (band1_stop > band1_max) exit 7132 7133 ! Collect nb bands starting from band1_start on each proc. 7134 call ugb_kibz%collect_cplx(npw_ki * nspinor, nb, [1, band1_start], ug1_block) 7135 7136 ABI_MALLOC(gh1c_block, (2, npw_ki*nspinor, 3, nb)) 7137 do il_b1=1, ugb_kibz%sizeb_local(2) 7138 band1 = ugb_kibz%loc2gcol(il_b1) 7139 eig_nk = gwr%ks_ebands%eig(band1, ik_ibz, spin) 7140 7141 ! FIXME: This is wrong if spc 7142 !call c_f_pointer(c_loc(ugb_kibz%buffer_cplx(:,il_b1)), cwave, shape=[2, npw_ki*nspinor]) 7143 !call ddkop%apply(eig_nk, npw_ki, nspinor, cwave, cwaveprj) 7144 !gh1c_block(:,:,:,xx_ib) = ddkop%gh1c(:, 1:npw_ki*nspinor,:) 7145 end do 7146 7147 ! Loop over "conduction" states. 7148 !do band1=band1_start, band1_stop 7149 do ib=1,nb 7150 band1 = band1_start + ib - 1 7151 ug1 = ug1_block(:, ib) 7152 call fft_ug(npw_ki, u_nfft, nspinor, ndat1, u_mgfft, u_ngfft, istwf_ki, kg_ki, u_gbound, ug1, ur1_kibz) 7153 !call fft_ug(npw_ki, u_nfft, nspinor, ndat1, u_mgfft, u_ngfft, istwf_ki, kg_ki, u_gbound, ug1_block(:,ib), ur1_kibz) 7154 7155 ! Loop over "valence" states. 7156 !do band2=1,gwr%ugb_nband 7157 do il_b2=1, ugb_kibz%sizeb_local(2) 7158 band2 = ugb_kibz%loc2gcol(il_b2) 7159 7160 deltaeKS_b1b2 = ks_eig(band1, ik_ibz, spin) - ks_eig(band2, ik_ibz, spin) 7161 deltaf_b1b2 = spin_fact * (qp_occ(band1, ik_ibz, spin) - qp_occ(band2, ik_ibz, spin)) 7162 deltaeGW_b1b2 = qp_eig(band1, ik_ibz, spin) - qp_eig(band2, ik_ibz, spin) 7163 7164 ! Skip negligible transitions. 7165 if (abs(deltaf_b1b2) < GW_TOL_DOCC) CYCLE 7166 ! Adler-Wiser expression. 7167 ! Add small imaginary of the Time-Ordered response function but only for non-zero real omega 7168 ! FIXME What about metals? 7169 if (.not. use_tr) then 7170 ! Adler-Wiser without time-reversal. 7171 do io=1,nomega 7172 green_w(io) = g0g0w(omega(io), deltaf_b1b2, deltaeGW_b1b2, zcut, GW_TOL_W0, one_pole) 7173 end do 7174 7175 else 7176 if (band1 < band2) CYCLE ! Here we GAIN a factor ~2 7177 7178 do io=1,nomega 7179 ! Rangel: In metals, the intra-band transitions term does not contain the antiresonant part 7180 ! if(abs(deltaeGW_b1b2)>GW_TOL_W0) green_w(io) = g0g0w(omega(io),deltaf_b1b2,deltaeGW_b1b2,zcut,GW_TOL_W0) 7181 if (band1 == band2) green_w(io) = g0g0w(omega(io), deltaf_b1b2, deltaeGW_b1b2, zcut, GW_TOL_W0, one_pole) 7182 if (band1 /= band2) green_w(io) = g0g0w(omega(io), deltaf_b1b2, deltaeGW_b1b2, zcut, GW_TOL_W0, two_poles) 7183 end do 7184 end if 7185 7186 ug2 = ugb_kibz%buffer_cplx(:, il_b2) 7187 call fft_ug(npw_ki, u_nfft, nspinor, ndat1, u_mgfft, u_ngfft, istwf_ki, kg_ki, u_gbound, ug2, ur2_kibz) 7188 7189 ! FIXME: nspinor 2 is wrong as we have a 2x2 matrix 7190 ur_prod(:) = conjg(ur1_kibz(:)) * ur2_kibz 7191 call fft_ur(npwe, u_nfft, nspinor, ndat1, u_mgfft, u_ngfft, istwfk1, gvec_q0, gbound_q0, ur_prod, rhotwg) 7192 7193 if (gwr%usepaw == 0) then 7194 ! Matrix elements of i[H,r] for NC pseudopotentials. 7195 ! NB ug1 and ug2 are kind=gwpc 7196 rhotwx = nc_ihr_comm(vkbr(ik_ibz), cryst, gwr%psps, npw_ki, nspinor, istwf_ki, gwr%dtset%inclvkb, & 7197 kk_ibz, ug1, ug2, kg_ki) 7198 end if 7199 7200 ! Treat a possible degeneracy between v and c. 7201 ! Adler-Wiser expression, to be consistent here we use the KS eigenvalues (?) 7202 if (abs(deltaeKS_b1b2) > GW_TOL_W0) then 7203 rhotwx = -rhotwx / deltaeKS_b1b2 7204 else 7205 rhotwx = czero_gw 7206 end if 7207 7208 !new_rhotwx = zero 7209 !gh1c_block(:,:,:,ib) = ddkop%gh1c(:, 1:npw_ki*nspinor,:) 7210 7211 ! NB: Using symrec conventions here 7212 ik_ibz = gwr%kbz2ibz(1, ik_bz); isym_k = gwr%kbz2ibz(2, ik_bz) 7213 trev_k = gwr%kbz2ibz(6, ik_bz); g0_k = gwr%kbz2ibz(3:5, ik_bz) 7214 trev_k = trev_k + 1 ! NB: GW routines assume trev in [1, 2] 7215 7216 ! TODO: Metals 7217 call accumulate_head_wings_imagw( & 7218 npwe, nomega, nI, nJ, dtset%symchi, & 7219 is_metallic, ik_bz, isym_k, trev_k, nspinor, cryst, ltg_q, gsph, & 7220 rhotwx, rhotwg, green_w, chi0_head, chi0_lwing, chi0_uwing) 7221 end do ! band2 7222 end do ! band1 7223 7224 ABI_FREE(ug1_block) 7225 ABI_SFREE(gh1c_block) 7226 end do ! band1_start 7227 7228 !if (gwr%usepaw == 0 .and. dtset%inclvkb /= 0 .and. dtset%symchi == 1) then 7229 ! call vkbr_free(vkbr(ik_ibz)) ! Not need anymore as we loop only over IBZ. 7230 !end if 7231 7232 ABI_FREE(ug1) 7233 ABI_FREE(ug2) 7234 if (print_time) then 7235 write(msg,'(4x,3(a,i0),a)')"my_ikf [", my_ikf, "/", gwr%my_nkbz, "] (tot: ", gwr%nkbz, ")" 7236 call cwtime_report(msg, cpu_k, wall_k, gflops_k); if (my_ikf == LOG_MODK) call wrtout(std_out, " ...") 7237 end if 7238 end do ! my_ikf 7239 end do ! my_is 7240 7241 call ddkop%free() 7242 ABI_FREE(bbp_mask) 7243 ABI_FREE(gvec_q0) 7244 ABI_FREE(gbound_q0) 7245 ABI_FREE(work) 7246 ABI_FREE(ur1_kibz) 7247 ABI_FREE(ur2_kibz) 7248 ABI_FREE(ur_prod) 7249 ABI_FREE(rhotwg) 7250 ABI_FREE(u_gbound) 7251 call vkbr_free(vkbr) 7252 ABI_FREE(vkbr) 7253 7254 ! Collect head and wings. 7255 call xmpi_sum(chi0_head, gwr%comm%value, ierr) 7256 call xmpi_sum(chi0_lwing, gwr%comm%value, ierr) 7257 call xmpi_sum(chi0_uwing, gwr%comm%value, ierr) 7258 7259 chi0_head = chi0_head * weight / cryst%ucvol 7260 ! Tensor in terms of reciprocal lattice vectors. 7261 do io=1,nomega 7262 chi0_head(:,:,io) = matmul(chi0_head(:,:,io), cryst%gmet) * (two_pi**2) 7263 end do 7264 chi0_lwing = chi0_lwing * weight / cryst%ucvol 7265 chi0_uwing = chi0_uwing * weight / cryst%ucvol 7266 7267 ! =============================================== 7268 ! ==== Symmetrize chi0 in case of AFM system ==== 7269 ! =============================================== 7270 ! Reconstruct $chi0{\down,\down}$ from $chi0{\up,\up}$. 7271 ! Works only in the case of magnetic group Shubnikov type IV. 7272 if (cryst%use_antiferro) then 7273 call symmetrize_afm_chi0(Cryst, gsph, ltg_q, npwe, nomega, & 7274 chi0_head=chi0_head, chi0_lwing=chi0_lwing, chi0_uwing=chi0_uwing) 7275 end if 7276 7277 if (gwr%comm%me == 0 .and. gwr%dtset%prtvol >= 1) then 7278 ! Construct head and wings from the tensor and output results. 7279 qlen = tol3 7280 call cryst%get_redcart_qdirs(nq, qdirs, qlen=qlen) 7281 ABI_MALLOC(head_qvals, (nq)) 7282 call wrtout(units, " Head of the irreducible polarizability for q --> 0", pre_newlines=1) 7283 call wrtout(units, sjoin(" q0_len:", ftoa(qlen), "(Bohr^-1)")) 7284 write(msg, "(*(a14))") "iomega (eV)", "[100]", "[010]", "[001]", "x", "y", "z" 7285 call wrtout(units, msg) 7286 do io=1,nomega 7287 do iq=1,nq 7288 chq = matmul(chi0_head(:,:,io), qdirs(:,iq)) 7289 head_qvals(iq) = vdotw(qdirs(:, iq), chq, cryst%gmet, "G") 7290 end do 7291 write(msg, "(*(es12.5,2x))") gwr%iw_mesh(io) * Ha_eV, real(head_qvals(:)) 7292 call wrtout(units, msg) 7293 ! Write imag part to std_out only 7294 write(msg, "(*(es12.5,2x))") gwr%iw_mesh(io) * Ha_eV, aimag(head_qvals(:)) 7295 call wrtout(std_out, msg) 7296 end do 7297 call wrtout(units, " ") 7298 ABI_FREE(qdirs) 7299 ABI_FREE(head_qvals) 7300 end if 7301 7302 ! Save quantities for later use as this routine must be called before build_tchi. 7303 if (gwr%kpt_comm%me == 0) then 7304 ABI_REMALLOC(gwr%chi0_head_myw, (3, 3, gwr%my_ntau) ) 7305 ABI_REMALLOC(gwr%chi0_uwing_myw, (3, npwe, gwr%my_ntau) ) 7306 ABI_REMALLOC(gwr%chi0_lwing_myw, (3, npwe, gwr%my_ntau) ) 7307 7308 do my_it=1,gwr%my_ntau 7309 itau = gwr%my_itaus(my_it) 7310 gwr%chi0_head_myw(:,:,my_it) = chi0_head(:,:,itau) 7311 do ii=1,3 7312 gwr%chi0_uwing_myw(ii,:,my_it) = chi0_uwing(:,itau,ii) 7313 gwr%chi0_lwing_myw(ii,:,my_it) = chi0_lwing(:,itau,ii) 7314 end do 7315 end do 7316 end if 7317 7318 ABI_FREE(chi0_lwing) 7319 ABI_FREE(chi0_uwing) 7320 ABI_FREE(chi0_head) 7321 call ltg_q%free() 7322 call gsph%free() 7323 7324 call cwtime_report(" gwr_build_chi0_head_and_wings:", cpu_all, wall_all, gflops_all) 7325 call timab(1927, 2, tsec) 7326 7327 end subroutine gwr_build_chi0_head_and_wings
m_gwr/gwr_build_green [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_build_green
FUNCTION
Build Green's functions in imaginary time from the gwr%ugb matrices stored in memory. Store only G_k for the IBZ k-points treated by this MPI proc.
INPUTS
free_ugb: True if the gwr%ugb wavefunctions should be deallocated before returning.
SOURCE
2383 subroutine gwr_build_green(gwr, free_ugb) 2384 2385 !Arguments ------------------------------------ 2386 class(gwr_t),target,intent(inout) :: gwr 2387 logical,intent(in) :: free_ugb 2388 2389 !Local variables------------------------------- 2390 !scalars 2391 integer :: my_is, my_iki, spin, ik_ibz, band, itau, ipm, il_b, npwsp, isgn, my_it, nb_occ 2392 real(dp) :: f_nk, eig_nk, cpu, wall, gflops, cpu_k, wall_k, gflops_k 2393 logical :: print_time 2394 character(len=500) :: msg 2395 real(dp) :: gt_rfact 2396 type(__slkmat_t), target :: work_gb, green 2397 !arrays 2398 integer :: mask_kibz(gwr%nkibz), units(2), ija(2), ijb(2) 2399 real(dp) :: tsec(2) 2400 real(dp),contiguous, pointer :: qp_eig(:,:,:), qp_occ(:,:,:) 2401 2402 ! ************************************************************************* 2403 2404 call cwtime(cpu, wall, gflops, "start") 2405 call timab(1922, 1, tsec) 2406 units = [std_out, ab_out] 2407 2408 ! Use KS or QP energies depending on the iteration state. 2409 if (gwr%scf_iteration == 1) then 2410 call wrtout(units, " Building Green's functions from KS orbitals and KS energies...", & 2411 pre_newlines=2, newlines=1, do_flush=.True.) 2412 qp_eig => gwr%ks_ebands%eig; qp_occ => gwr%ks_ebands%occ 2413 msg = sjoin("Fermi energy is not set to zero! fermie:", ftoa(gwr%ks_ebands%fermie)) 2414 ABI_CHECK(abs(gwr%ks_ebands%fermie) < tol12, msg) 2415 2416 ! Allocate my Green's functions in IBZ if this is the first iteration. 2417 mask_kibz = 0; mask_kibz(gwr%my_kibz_inds(:)) = 1 2418 call gwr%malloc_free_mats(mask_kibz, "green", "malloc") 2419 2420 else 2421 call wrtout(units, " Building Green's functions from KS orbitals and QP energies...", & 2422 pre_newlines=2, newlines=1, do_flush=.True.) 2423 qp_eig => gwr%qp_ebands%eig; qp_occ => gwr%qp_ebands%occ 2424 msg = sjoin("Fermi energy is not set to zero! fermie:", ftoa(gwr%qp_ebands%fermie)) 2425 ABI_CHECK(abs(gwr%qp_ebands%fermie) < tol12, msg) 2426 end if 2427 2428 ABI_CHECK(allocated(gwr%ugb), "gwr%ugb array should be allocated!") 2429 2430 do my_is=1,gwr%my_nspins 2431 spin = gwr%my_spins(my_is) 2432 ! Loop over my k-points in the IBZ 2433 do my_iki=1,gwr%my_nkibz 2434 print_time = gwr%comm%me == 0 .and. (my_iki < LOG_MODK .or. mod(my_iki, LOG_MODK) == 0) 2435 if (print_time) call cwtime(cpu_k, wall_k, gflops_k, "start") 2436 ik_ibz = gwr%my_kibz_inds(my_iki) 2437 associate (ugb_ks => gwr%ugb(ik_ibz, spin), desc_k => gwr%green_desc_kibz(ik_ibz)) 2438 npwsp = desc_k%npw * gwr%nspinor 2439 2440 call ugb_ks%copy(work_gb) 2441 !call ugb_ks%change_size_blocs(work_gb, size_blocs=, processor=) 2442 !call work_gb%copy(green, empty=.True.) 2443 2444 ! Init output of pzgemm in g-communicator 2445 call green%init(npwsp, npwsp, gwr%g_slkproc, istwfk1) ! size_blocs=[-1, col_bsize]) 2446 2447 ! Loop over my_ntau as pzgemm is MPI-parallelized inside g_comm. 2448 do my_it=1,gwr%my_ntau 2449 itau = gwr%my_itaus(my_it) 2450 do ipm=1,2 2451 ! Multiply my columns by exponentials in imaginary time. 2452 work_gb%buffer_cplx = ugb_ks%buffer_cplx 2453 2454 !$OMP PARALLEL DO PRIVATE(band, f_nk, eig_nk, gt_rfact) 2455 do il_b=1, work_gb%sizeb_local(2) 2456 band = work_gb%loc2gcol(il_b) 2457 f_nk = qp_occ(band, ik_ibz, spin) 2458 eig_nk = qp_eig(band, ik_ibz, spin) 2459 gt_rfact = zero 2460 if (ipm == 2) then 2461 if (eig_nk < -tol6) gt_rfact = exp(gwr%tau_mesh(itau) * eig_nk) 2462 else 2463 if (eig_nk > tol6) gt_rfact = exp(-gwr%tau_mesh(itau) * eig_nk) 2464 end if 2465 2466 !work_gb%buffer_cplx(:,il_b) = work_gb%buffer_cplx(:,il_b) * sqrt(gt_rfact) 2467 call xscal(npwsp, real(sqrt(gt_rfact), kind=gwpc), work_gb%buffer_cplx(:,il_b), 1) 2468 end do ! il_b 2469 2470 ! Now build G(g,g',ipm) with PZGEMM. 2471 isgn = merge(1, -1, ipm == 2) 2472 ija = [1, 1]; ijb = [1, 1] 2473 ! TODO: optimize 2474 nb_occ = -1 2475 !if (ipm == 1) then 2476 ! ija = [1, nb_occ]; ijb = ija 2477 !else 2478 ! ija = [nb_occ+1, gwr%ugb_nband]; ijb = ija 2479 !end if 2480 call slk_pgemm("N", "C", work_gb, isgn * cone_gw, work_gb, czero_gw, green, ija=ija, ijb=ijb) 2481 2482 ! Redistribute data. 2483 call gwr%gt_kibz(ipm, ik_ibz, itau, spin)%take_from(green) 2484 end do ! ipm 2485 end do ! itau 2486 2487 call work_gb%free(); call green%free() 2488 ! Free wavefunctions if asked for. 2489 if (free_ugb) call ugb_ks%free() 2490 2491 if (print_time) then 2492 write(msg,'(4x,3(a,i0),a)')"G_ikbz [", my_iki, "/", gwr%my_nkibz, "] (tot: ", gwr%nkibz, ")" 2493 call cwtime_report(msg, cpu_k, wall_k, gflops_k); if (my_iki == LOG_MODK) call wrtout(std_out, " ...") 2494 end if 2495 end associate 2496 end do ! my_iki 2497 end do ! my_is 2498 2499 if (gwr%dtset%prtvol > 0) call gwr_print_trace(gwr, "gt_kibz") 2500 call gwr%print_mem(unit=std_out) 2501 2502 call cwtime_report(" gwr_build_green:", cpu, wall, gflops) 2503 call timab(1922, 2, tsec) 2504 2505 end subroutine gwr_build_green
m_gwr/gwr_build_sigmac [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_build_sigmac
FUNCTION
Build Sigma_c(i tau) and compute matrix elements in the KS basis set.
INPUTS
OUTPUT
SOURCE
5103 subroutine gwr_build_sigmac(gwr) 5104 5105 !Arguments ------------------------------------ 5106 class(gwr_t),target,intent(inout) :: gwr 5107 5108 #ifndef FC_CRAY 5109 !Local variables------------------------------- 5110 !scalars 5111 integer,parameter :: master = 0 5112 integer :: my_is, my_it, spin, ikcalc_ibz, ik_ibz, sc_nfft, my_ir, my_nr, iw, idat, max_ndat, ndat, ii, jj, irow 5113 integer :: iq_ibz, iq_bz, itau, ierr, ibc, bmin, bmax, band, band1 5114 integer :: band2, band2_start, band2_stop, nbc, ib1, ib2, pade_npts 5115 integer :: my_ikf, ipm, ik_bz, ikcalc, uc_ir, ir, ncid, col_bsize, nrsp, sc_nfftsp 5116 integer :: isym_k, trev_k, g0_k(3), tsign_k !, b1gw, b2gw, ! npwsp, my_iqi, sc_ir, ig, my_iqf, 5117 integer(kind=XMPI_ADDRESS_KIND) :: buf_count 5118 integer :: gt_scbox_win, wct_scbox_win, use_umklp, ideg, nstates 5119 real(dp) :: cpu_tau, wall_tau, gflops_tau, cpu_all, wall_all, gflops_all !, cpu, wall, gflops 5120 real(dp) :: mem_mb, cpu_ir, wall_ir, gflops_ir, cpu_ikf, wall_ikf, gflops_ikf 5121 real(dp) :: max_abs_imag_wct, max_abs_re_wct, sck_ucvol, scq_ucvol, wtqm, wtqp 5122 logical :: k_is_gamma, use_shmem_for_k, use_mpi_for_k, isirr_k 5123 logical :: compute_this_kbz, print_time, define, sigc_is_herm, band_inversion 5124 character(len=500) :: msg 5125 !type(desc_t), pointer :: desc_q !, desc_k 5126 type(yamldoc_t) :: ydoc 5127 type(c_ptr) :: void_ptr 5128 !arrays 5129 integer :: sc_ngfft(18), need_qibz(gwr%nqibz), got_qibz(gwr%nqibz), units(2), dat_units(3), g0_q(3) ! gg(3), 5130 integer,allocatable :: green_scgvec(:,:), wc_scgvec(:,:) 5131 real(dp) :: kk_bz(3), kcalc_bz(3), qq_bz(3), tsec(2) !, qq_ibz(3) 5132 complex(gwpc) :: cpsi_r, sigc_pm(2) 5133 complex(dp) :: odd_t(gwr%ntau), even_t(gwr%ntau), avg_2ntau(2,gwr%ntau) 5134 complex(dp),target,allocatable :: sigc_it_mat(:,:,:,:,:,:) 5135 complex(gwpc) ABI_ASYNC, contiguous, pointer :: gt_scbox(:,:,:), wct_scbox(:,:) 5136 complex(gwpc),allocatable :: uc_psir_bk(:,:,:), scph1d_kcalc(:,:,:), uc_ceikr(:), ur(:) 5137 type(__slkmat_t) :: gt_gpr(2, gwr%my_nkbz), gk_rpr_pm(2), sigc_rpr(2,2,gwr%nkcalc), wc_rpr, wc_gpr(gwr%my_nqbz) 5138 type(desc_t), target :: desc_mykbz(gwr%my_nkbz), desc_myqbz(gwr%my_nqbz) 5139 type(fftbox_plan3_t) :: green_plan, wt_plan 5140 type(littlegroup_t) :: ltg_kcalc(gwr%nkcalc) 5141 type(gaps_t) :: new_gaps 5142 integer :: band_val, ibv, ncerr, unt_it, unt_iw, unt_rw 5143 real(dp) :: e0, ks_gap, qp_gap, qp_pade_gap, sigx, vxc_val, vu, v_meanf, eshift, sigma_fact 5144 complex(dp) :: zz, zsc, sigc_e0__, dsigc_de0, z_e0, sig_xc, hhartree_bk, qp_ene, qp_ene_prev 5145 integer,allocatable :: iperm(:) 5146 integer :: gt_request, wct_request 5147 real(dp),allocatable :: sorted_qpe(:) 5148 real(dp) :: e0_kcalc(gwr%b1gw:gwr%b2gw, gwr%nkcalc, gwr%nsppol), rw_mesh(gwr%nwr) 5149 real(dp) :: spfunc_diag(gwr%nwr, gwr%b1gw:gwr%b2gw, gwr%nkcalc, gwr%nsppol) 5150 integer :: pade_solver_ierr(gwr%b1gw:gwr%b2gw, gwr%nkcalc, gwr%nsppol) 5151 real(dp) :: ks_gaps(gwr%nkcalc, gwr%nsppol), qpz_gaps(gwr%nkcalc, gwr%nsppol) !, qp_pade_gaps(gwr%nkcalc, gwr%nsppol) 5152 complex(dp) :: ze0_kcalc(gwr%b1gw:gwr%b2gw, gwr%nkcalc, gwr%nsppol) 5153 complex(dp) :: sigc_e0(gwr%b1gw:gwr%b2gw, gwr%nkcalc, gwr%nsppol) 5154 complex(dp) :: qpz_ene(gwr%b1gw:gwr%b2gw, gwr%nkcalc, gwr%nsppol), imag_zmesh(gwr%ntau) 5155 complex(dp) :: qp_pade(gwr%b1gw:gwr%b2gw, gwr%nkcalc, gwr%nsppol) 5156 complex(dp) :: sigxc_rw_diag(gwr%nwr, gwr%b1gw:gwr%b2gw, gwr%nkcalc, gwr%nsppol) 5157 type(sigma_pade_t) :: spade 5158 type(sigijtab_t),allocatable :: Sigxij_tab(:,:), Sigcij_tab(:,:) 5159 5160 ! ************************************************************************* 5161 call cwtime(cpu_all, wall_all, gflops_all, "start") 5162 call timab(1925, 1, tsec) 5163 5164 ABI_CHECK(gwr%wc_space == "itau", sjoin("wc_space: ", gwr%wc_space, " != itau")) 5165 5166 !mask_kibz = 0; mask_kibz(gwr%my_kibz_inds(:)) = 1 5167 !call gwr%malloc_free_mats(mask_kibz, "sigma" "malloc") 5168 5169 !if (gwr%scf_iteration == 1) then 5170 !else 5171 !end if 5172 5173 ! Set FFT mesh in the supercell. 5174 ! Be careful when using the FFT plan as ndat can change inside the loop if we start to block. 5175 ! Perhaps the safest approach would be to generate the plan on the fly. 5176 5177 sc_ngfft = gwr%g_ngfft 5178 sc_ngfft(1:3) = gwr%ngkpt * gwr%g_ngfft(1:3) 5179 sc_ngfft(4:6) = sc_ngfft(1:3) 5180 sc_nfft = product(sc_ngfft(1:3)); sc_nfftsp = sc_nfft * gwr%nspinor 5181 !sc_mgfft = maxval(sc_ngfft(1:3)) 5182 sck_ucvol = gwr%cryst%ucvol * product(gwr%ngkpt) 5183 scq_ucvol = gwr%cryst%ucvol * product(gwr%ngqpt) 5184 5185 ! Set FFT mesh used to compute u(r) in the unit cell. 5186 call gwr%kcalc_wfd%change_ngfft(gwr%cryst, gwr%psps, gwr%g_ngfft) 5187 5188 ! Table for \Sigmac_ij matrix elements. 5189 sigc_is_herm = .False. 5190 call sigtk_sigma_tables(gwr%nkcalc, gwr%nkibz, gwr%nsppol, gwr%bstart_ks, gwr%bstop_ks, gwr%kcalc2ibz(:,1), & 5191 gwr%sig_diago, sigc_is_herm, sigxij_tab, sigcij_tab) 5192 5193 call sigijtab_free(Sigxij_tab) 5194 ABI_FREE(Sigxij_tab) 5195 5196 units = [std_out, ab_out] 5197 !if (gwr%sig_diago) then 5198 ! call wrtout(units, " Computing diagonal matrix elements of Sigma_c", pre_newlines=1) 5199 !else 5200 ! call wrtout(units, " Computing diagonal + off-diagonal matrix elements of Sigma_c", pre_newlines=1) 5201 !end if 5202 5203 ! Allocate matrix elements Sigmac_(itau) in the KS basis set. 5204 ii = gwr%b1gw; jj = gwr%b2gw 5205 if (gwr%sig_diago) then 5206 ii = 1; jj = 1 5207 end if 5208 ABI_CALLOC(sigc_it_mat, (2, gwr%ntau, gwr%b1gw:gwr%b2gw, ii:jj, gwr%nkcalc, gwr%nsppol)) 5209 ABI_RECALLOC(gwr%sigc_iw_mat, (gwr%ntau, gwr%b1gw:gwr%b2gw, ii:jj, gwr%nkcalc, gwr%nsppol)) 5210 5211 max_abs_imag_wct = zero; max_abs_re_wct = zero 5212 call gwr%print_mem(unit=std_out) 5213 5214 if (gwr%use_supercell_for_sigma) then 5215 5216 ! NOTE: 5217 ! There are two possibilities here: 5218 ! 5219 ! 1) Compute the matrix elements of Sigma_c in the KS basis set by integrating over the real-space supercell. 5220 ! 5221 ! 2) Compute and store Sigma_c^k(g,g',iomega) and then compute the matrix elements in g-space. 5222 ! 5223 ! The first option requires less memory provided we are interested in a small set of KS states. 5224 ! The second option is interesting if we need to compute several matrix elements, including off-diagonal terms. 5225 call print_sigma_header() 5226 5227 max_ndat = gwr%sc_batch_size 5228 use_mpi_for_k = gwr%sc_batch_size > 1 .and. gwr%sc_batch_size == gwr%kpt_comm%nproc 5229 use_mpi_for_k = .False. 5230 5231 use_shmem_for_k = gwr%sc_batch_size == gwr%kpt_comm%nproc .and. gwr%kpt_comm%nproc > 1 5232 use_shmem_for_k = use_shmem_for_k .and. gwr%kpt_comm%can_use_shmem() 5233 !use_shmem_for_k = .False. 5234 5235 if (use_shmem_for_k) then 5236 buf_count = 2 * (sc_nfftsp * max_ndat * 2) 5237 call gwr%kpt_comm%allocate_shared_master(buf_count, gwpc, xmpi_info_null, void_ptr, gt_scbox_win) 5238 call c_f_pointer(void_ptr, gt_scbox, shape=[sc_nfftsp, max_ndat, 2]) 5239 buf_count = 2 * (sc_nfftsp * max_ndat) 5240 call gwr%kpt_comm%allocate_shared_master(buf_count, gwpc, xmpi_info_null, void_ptr, wct_scbox_win) 5241 call c_f_pointer(void_ptr, wct_scbox, shape=[sc_nfftsp, max_ndat]) 5242 end if 5243 5244 call wrtout(std_out, sjoin(" use_mpi_for_k:", yesno(use_mpi_for_k))) 5245 call wrtout(std_out, sjoin(" use_shmem_for_k:", yesno(use_shmem_for_k))) 5246 mem_mb = 3 * (sc_nfftsp * max_ndat * gwpc) * b2Mb 5247 call wrtout(std_out, sjoin(" Memory for gt_scbox/wct_scbox arrays:", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 5248 5249 if (.not. use_shmem_for_k) then 5250 ABI_CALLOC(gt_scbox, (sc_nfft * gwr%nspinor, max_ndat, 2)) 5251 ABI_CALLOC(wct_scbox, (sc_nfft * gwr%nspinor, max_ndat)) 5252 end if 5253 5254 ! Build plans for dense FFTs. 5255 call green_plan%from_ngfft(sc_ngfft, gwr%nspinor*max_ndat*2, gwr%dtset%gpu_option) 5256 call wt_plan%from_ngfft(sc_ngfft, gwr%nspinor*max_ndat, gwr%dtset%gpu_option) 5257 5258 sigma_fact = one / (sck_ucvol * scq_ucvol) 5259 5260 ! The g-vectors in the supercell for G and tchi. 5261 ABI_MALLOC(green_scgvec, (3, gwr%green_mpw)) 5262 ABI_MALLOC(wc_scgvec, (3, gwr%tchi_mpw)) 5263 5264 do my_is=1,gwr%my_nspins 5265 spin = gwr%my_spins(my_is) 5266 5267 ! Load wavefunctions for GW corrections in the unit cell. 5268 ! TODO: MPI distribute or use MPI shared memory 5269 bmin = minval(gwr%bstart_ks(:, spin)); bmax = maxval(gwr%bstop_ks(:, spin)) 5270 ABI_MALLOC_OR_DIE(uc_psir_bk, (gwr%g_nfft * gwr%nspinor, bmin:bmax, gwr%nkcalc), ierr) 5271 ABI_MALLOC(ur, (gwr%g_nfft * gwr%nspinor)) 5272 ABI_MALLOC(uc_ceikr, (gwr%g_nfft * gwr%nspinor)) 5273 5274 do ikcalc=1,gwr%nkcalc 5275 kcalc_bz = gwr%kcalc(:, ikcalc); ikcalc_ibz = gwr%kcalc2ibz(ikcalc, 1) ! NB: Assuming wfs in the IBZ. 5276 ! Compute e^{ik.r} phases in the unit cell. 5277 call calc_ceikr(kcalc_bz, gwr%g_ngfft, gwr%g_nfft, gwr%nspinor, uc_ceikr) 5278 5279 do band=gwr%bstart_ks(ikcalc, spin), gwr%bstop_ks(ikcalc, spin) 5280 call gwr%kcalc_wfd%get_ur(band, ikcalc_ibz, spin, ur) 5281 uc_psir_bk(:, band, ikcalc) = ur * uc_ceikr 5282 end do 5283 end do ! ikcalc 5284 5285 ABI_FREE(ur) 5286 ABI_FREE(uc_ceikr) 5287 5288 ! Pre-compute one-dimensional factors to get 3d e^{ik.L} 5289 call get_1d_sc_phases(gwr%ngkpt, gwr%nkcalc, gwr%kcalc, scph1d_kcalc) 5290 5291 ! Construct Sigma(itau) in the supercell. 5292 do my_it=1,gwr%my_ntau 5293 call cwtime(cpu_tau, wall_tau, gflops_tau, "start") 5294 itau = gwr%my_itaus(my_it) 5295 !if (my_it == 1 .and. gwr%comm%me == 0) call gwr%pstat%print([std_out], reload=.True.) 5296 5297 ! G_k(g,g') --> G_k(g',r) e^{ik.r} for each k in the BZ treated by me. 5298 call gwr%get_myk_green_gpr(itau, spin, desc_mykbz, gt_gpr) 5299 5300 ! Wc_q(g,g') --> Wc_q(g',r) e^{iq.r} for each q in the BZ treated by me. 5301 call gwr%get_myq_wc_gpr(itau, spin, desc_myqbz, wc_gpr) 5302 !if (my_it == 1 .and. gwr%comm%me == 0) call gwr%pstat%print([std_out], reload=.True.) 5303 5304 my_nr = gt_gpr(1,1)%sizeb_local(2) 5305 ABI_CHECK(my_nr == wc_gpr(1)%sizeb_local(2), "my_nr != wc_gpr(1)%sizeb_local(2)") 5306 5307 ! Loop over r in the unit cell that is now MPI-distributed inside g_comm. 5308 do my_ir=1, my_nr, gwr%sc_batch_size 5309 print_time = (gwr%comm%me == 0 .and. (my_ir <= 3 * gwr%sc_batch_size .or. mod(my_ir, LOG_MODR) == 0)) 5310 if (print_time) call cwtime(cpu_ir, wall_ir, gflops_ir, "start") 5311 ndat = blocked_loop(my_ir, my_nr, gwr%sc_batch_size) 5312 uc_ir = gt_gpr(1,1)%loc2gcol(my_ir) ! FIXME: This won't work if nspinor 2 5313 5314 ! TODO: Should block using nproc in kpt_comm, scatter data and perform multiple FFTs in parallel. 5315 if (.not. use_shmem_for_k) then 5316 5317 ! Insert G_k(g',r) in G'-space in the supercell FFT box (ndat vectors starting at my_ir). 5318 call gwr%gk_to_scbox(sc_ngfft, desc_mykbz, green_scgvec, my_ir, ndat, gt_gpr, gt_scbox) 5319 if (gwr%kpt_comm%nproc > 1) call xmpi_isum_ip(gt_scbox, gwr%kpt_comm%value, gt_request, ierr) 5320 5321 ! Insert Wc_q(g',r) in G'-space in the supercell FFT box (ndat vectors starting at my_ir) 5322 call gwr%wcq_to_scbox(sc_ngfft, desc_myqbz, wc_scgvec, my_ir, ndat, wc_gpr, wct_scbox) 5323 if (gwr%kpt_comm%nproc > 1) call xmpi_isum_ip(wct_scbox, gwr%kpt_comm%value, wct_request, ierr) 5324 5325 ! G(G',r) --> G(R',r) 5326 if (gwr%kpt_comm%nproc > 1) call xmpi_wait(gt_request, ierr) 5327 call green_plan%execute(gt_scbox(:,1,1), -1, iscale=0) 5328 5329 ! Wc(G',r) --> Wc(R',r) 5330 if (gwr%kpt_comm%nproc > 1) call xmpi_wait(wct_request, ierr) 5331 call wt_plan%execute(wct_scbox(:,1), -1, iscale=0) 5332 5333 ! Use gt_scbox to store GW (R',r, +/- i tau) for this set of ndat r-point 5334 gt_scbox(:,:,1) = gt_scbox(:,:,1) * wct_scbox(:,:) * sigma_fact 5335 gt_scbox(:,:,2) = gt_scbox(:,:,2) * wct_scbox(:,:) * sigma_fact 5336 !print *, "Maxval abs imag G:", maxval(abs(aimag(gt_scbox))) 5337 5338 else 5339 ! use_shmem_for_k --> MPI shared window version. Only gt_scbox are wct_scbox are shared. 5340 call gwr%gk_to_scbox(sc_ngfft, desc_mykbz, green_scgvec, my_ir, ndat, gt_gpr, gt_scbox, & 5341 gt_scbox_win=gt_scbox_win) 5342 5343 call gwr%wcq_to_scbox(sc_ngfft, desc_myqbz, wc_scgvec, my_ir, ndat, wc_gpr, wct_scbox, & 5344 wct_scbox_win=wct_scbox_win) 5345 5346 ! Now each MPI proc operates on different idat entries. 5347 call xmpi_win_fence(gt_scbox_win) 5348 idat = gwr%kpt_comm%me + 1 5349 if (idat <= ndat) then 5350 call wt_plan%execute(wct_scbox(:,idat), -1, ndat=gwr%nspinor, iscale=0) 5351 do ipm=1,2 5352 call green_plan%execute(gt_scbox(:,idat,ipm), -1, ndat=gwr%nspinor, iscale=0) 5353 gt_scbox(:,idat,ipm) = gt_scbox(:,idat,ipm) * wct_scbox(:,idat) * sigma_fact 5354 end do 5355 end if 5356 !IF (.not. MPI_ASYNC_PROTECTS_NONBLOCKING) CALL MPI_F_SYNC_REG(gt_scbox) 5357 !call xmpi_barrier(gwr%kpt_comm%value) 5358 call xmpi_win_fence(gt_scbox_win) 5359 end if 5360 5361 ! Integrate Sigma matrix elements in the R-supercell for ndat r-points and accumulate. 5362 ! possibly including off-diagonal terms. 5363 do ikcalc=1,gwr%nkcalc 5364 if (gwr%kpt_comm%skip(ikcalc)) cycle ! FIXME: Temporary hack till I find a better MPI algo for k-points. 5365 k_is_gamma = normv(gwr%kcalc(:,ikcalc), gwr%cryst%gmet, "G") < GW_TOLQ0 5366 5367 do band2=gwr%bstart_ks(ikcalc, spin), gwr%bstop_ks(ikcalc, spin) 5368 do irow=1,Sigcij_tab(ikcalc, spin)%col(band2)%size1 5369 band1 = Sigcij_tab(ikcalc, spin)%col(band2)%bidx(irow) 5370 do idat=1,ndat 5371 !if (use_shmem_for_k .and. idat /= gwr%kpt_comm + 1) cycle 5372 ir = uc_ir + idat - 1 5373 cpsi_r = conjg(uc_psir_bk(ir, band1, ikcalc)) 5374 do ipm=1,2 5375 call sc_sum(gwr%ngkpt, gwr%g_ngfft, gwr%nspinor, scph1d_kcalc(:,:,ikcalc), k_is_gamma, & 5376 cpsi_r, gt_scbox(:,idat,ipm), uc_psir_bk(:, band2, ikcalc), sigc_pm(ipm)) 5377 end do 5378 if (gwr%sig_diago) then 5379 sigc_it_mat(:, itau,band1,1,ikcalc,spin) = sigc_it_mat(:,itau,band1,1,ikcalc,spin) + sigc_pm(:) 5380 else 5381 sigc_it_mat(:,itau,band1,band2,ikcalc,spin) = sigc_it_mat(:,itau,band1,band2,ikcalc,spin) + sigc_pm(:) 5382 end if 5383 end do ! idat 5384 end do 5385 end do ! band2 5386 end do ! ikcalc 5387 5388 !if (use_shmem_for_k) call xmpi_sum 5389 5390 if (print_time) then 5391 write(msg,'(4x,3(a,i0),a)')"Sigma_c my_ir [", my_ir, "/", my_nr, "] (tot: ", gwr%g_nfft, ")" 5392 call cwtime_report(msg, cpu_ir, wall_ir, gflops_ir) 5393 end if 5394 end do ! my_ir 5395 5396 ! Free descriptors and PBLAS matrices in kBZ and qBZ. 5397 call desc_array_free(desc_mykbz); call desc_array_free(desc_myqbz) 5398 call slk_array_free(gt_gpr); call slk_array_free(wc_gpr) 5399 5400 write(msg,'(1x,3(a,i0),a)')"Sigma_c my_itau [", my_it, "/", gwr%my_ntau, "] (tot: ", gwr%ntau, ")" 5401 call cwtime_report(msg, cpu_tau, wall_tau, gflops_tau, end_str=ch10) 5402 end do ! my_it 5403 5404 ABI_FREE(scph1d_kcalc) 5405 ABI_FREE(uc_psir_bk) 5406 end do ! my_is 5407 5408 sigc_it_mat = -sigc_it_mat * (gwr%cryst%ucvol / gwr%g_nfft) ** 2 5409 5410 !call wrtout(std_out, sjoin(" Maxval abs re W:", ftoa(max_abs_re_wct))) 5411 !call wrtout(std_out, sjoin(" Maxval abs imag W:", ftoa(max_abs_imag_wct))) 5412 if (.not. use_shmem_for_k) then 5413 ABI_FREE(gt_scbox) 5414 ABI_FREE(wct_scbox) 5415 else 5416 call xmpi_win_free(gt_scbox_win) 5417 call xmpi_win_free(wct_scbox_win) 5418 end if 5419 5420 call green_plan%free() 5421 call wt_plan%free() 5422 5423 ABI_FREE(green_scgvec) 5424 ABI_FREE(wc_scgvec) 5425 5426 else 5427 ! =================================================================== 5428 ! Mixed-space algorithm in the unit cell with convolutions in k-space 5429 ! =================================================================== 5430 call print_sigma_header() 5431 5432 ! Define tables to account for symmetries: 5433 ! - when looping over the BZ, we only need to include the union of IBZ_x for x in kcalc. 5434 ! - when accumulating the self-energy, we have to use weights that depend on x. 5435 5436 ! * The little group is needed when symsigma == 1 5437 ! * If use_umklp == 1 then symmetries requiring an umklapp to preserve k_gw are included as well. 5438 ! * Note that TR is not yet supported so timrev is set to 1 even if TR has been used to generate the GS IBZ. 5439 use_umklp = 1 5440 do ikcalc=1,gwr%nkcalc 5441 call ltg_kcalc(ikcalc)%init(gwr%kcalc(:,ikcalc), gwr%nkbz, gwr%kbz, gwr%cryst, use_umklp, npwe=0, timrev=1) 5442 call ltg_kcalc(ikcalc)%print(unit=std_out, prtvol=gwr%dtset%prtvol) 5443 end do 5444 5445 ! Allocate PBLAS matrices to store Wc_q(r',r,tau), and Sigma_kcalc(r',r,+/-tau) in the unit cell. 5446 nrsp = gwr%g_nfft * gwr%nspinor 5447 col_bsize = nrsp / gwr%g_comm%nproc; if (mod(nrsp, gwr%g_comm%nproc) /= 0) col_bsize = col_bsize + 1 5448 5449 call wc_rpr%init(nrsp, nrsp, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 5450 do ipm=1,2 5451 call gk_rpr_pm(ipm)%init(nrsp, nrsp, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 5452 do ikcalc=1,gwr%nkcalc 5453 call sigc_rpr(1,ipm,ikcalc)%init(nrsp, nrsp, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 5454 ! For sigma we have to decompose it in hermitian/anti-hermitian part. 5455 !call sigc_rpr(2,ipm,ikcalc)%init(nrsp, nrsp, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 5456 end do 5457 end do 5458 5459 mem_mb = slk_array_locmem_mb(wc_rpr) + sum(slk_array_locmem_mb(gk_rpr_pm)) + sum(slk_array_locmem_mb(sigc_rpr)) 5460 call wrtout(std_out, sjoin(" Local memory for PBLAS (r,r') matrices: ", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 5461 5462 do my_is=1,gwr%my_nspins 5463 spin = gwr%my_spins(my_is) 5464 5465 ! Load wavefunctions for GW corrections in the real-space unit cell. 5466 ! TODO: MPI distribute or use MPI shared memory 5467 bmin = minval(gwr%bstart_ks(:, spin)); bmax = maxval(gwr%bstop_ks(:, spin)) 5468 ABI_MALLOC_OR_DIE(uc_psir_bk, (nrsp, bmin:bmax, gwr%nkcalc), ierr) 5469 ABI_MALLOC(ur, (nrsp)) 5470 5471 do ikcalc=1,gwr%nkcalc 5472 kcalc_bz = gwr%kcalc(:, ikcalc); ikcalc_ibz = gwr%kcalc2ibz(ikcalc, 1) ! NB: Assuming wfs in IBZ 5473 do band=gwr%bstart_ks(ikcalc, spin), gwr%bstop_ks(ikcalc, spin) 5474 call gwr%kcalc_wfd%get_ur(band, ikcalc_ibz, spin, ur) 5475 uc_psir_bk(:, band, ikcalc) = ur 5476 end do 5477 end do 5478 ABI_FREE(ur) 5479 5480 need_qibz = 0 5481 do my_ikf=1,gwr%my_nkbz 5482 ik_bz = gwr%my_kbz_inds(my_ikf); kk_bz = gwr%kbz(:, ik_bz) 5483 do ikcalc=1,gwr%nkcalc 5484 qq_bz = gwr%kcalc(:,ikcalc) - kk_bz 5485 !qq_bz = -qq_bz 5486 ! TODO: here I may need to take into account the umklapp 5487 call findqg0(iq_bz, g0_q, qq_bz, gwr%nqbz, gwr%qbz, gwr%mG0) 5488 !ABI_CHECK(all(g0_q == 0), sjoin("g0_q != 0, kcalc", ktoa(gwr%kcalc(:,ikcalc)), "kk_bz:", ktoa(kk_bz))) 5489 iq_ibz = gwr%qbz2ibz(1, iq_bz) 5490 need_qibz(iq_ibz) = 1 5491 end do 5492 end do 5493 5494 ! Construct Sigma(itau) using convolutions in k-space and real-space representation in the unit cell. 5495 do my_it=1,gwr%my_ntau 5496 call cwtime(cpu_tau, wall_tau, gflops_tau, "start") 5497 itau = gwr%my_itaus(my_it) 5498 5499 ! Redistribute W_q(g,g') in the IBZ so that each MPI proc can reconstruct Wc_q in the BZ inside the loops 5500 call gwr%redistrib_mats_qibz("wc", itau, spin, need_qibz, got_qibz, "communicate") 5501 call slk_array_set(sigc_rpr, czero) 5502 5503 ! Sum over my k-points in the BZ. 5504 do my_ikf=1,gwr%my_nkbz 5505 print_time = (gwr%comm%me == 0 .and. (my_ikf <= LOG_MODK .or. mod(my_ikf, LOG_MODK) == 0)) 5506 if (print_time) call cwtime(cpu_ikf, wall_ikf, gflops_ikf, "start") 5507 ik_bz = gwr%my_kbz_inds(my_ikf); kk_bz = gwr%kbz(:,ik_bz) 5508 5509 ik_ibz = gwr%kbz2ibz(1, ik_bz); isym_k = gwr%kbz2ibz(2, ik_bz) 5510 trev_k = gwr%kbz2ibz(6, ik_bz); g0_k = gwr%kbz2ibz(3:5, ik_bz) 5511 isirr_k = (isym_k == 1 .and. trev_k == 0 .and. all(g0_k == 0)) 5512 tsign_k = merge(1, -1, trev_k == 0) 5513 !if (.not. isirr_k) cycle 5514 5515 ! Skip this BZ k-point if it's not in the IBZ(ikcalc) of some ikcalc. 5516 compute_this_kbz = .True. 5517 if (gwr%dtset%symsigma /= 0) then 5518 compute_this_kbz = .False. 5519 do ikcalc=1,gwr%nkcalc 5520 if (ltg_kcalc(ikcalc)%ibzq(ik_bz) == 1) then 5521 compute_this_kbz = .True.; exit 5522 end if 5523 end do 5524 end if 5525 if (.not. compute_this_kbz) cycle ! my_ikf loop 5526 5527 ! Use symmetries to get G_kbz from the IBZ then G_k(g,g') --> G_k(r',r) 5528 call gwr%get_gkbz_rpr_pm(ik_bz, itau, spin, gk_rpr_pm) 5529 5530 do ikcalc=1,gwr%nkcalc 5531 if (gwr%dtset%symsigma /= 0 .and. ltg_kcalc(ikcalc)%ibzq(ik_bz) == 0) cycle ! FIXME: iq_bz or ikq? 5532 qq_bz = gwr%kcalc(:, ikcalc) - kk_bz 5533 !qq_bz = -qq_bz 5534 ! TODO: here I may need to take into account the umklapp 5535 call findqg0(iq_bz, g0_q, qq_bz, gwr%nqbz, gwr%qbz, gwr%mG0) 5536 !ABI_CHECK(all(g0_q == 0), sjoin("g0_q != 0", ktoa(gwr%kcalc(:,ikcalc)), "kk_bz", ktoa(kk_bz))) 5537 !iq_ibz = gwr%qbz2ibz(1, iq_bz) 5538 call gwr%get_wc_rpr_qbz(g0_q, iq_bz, itau, spin, wc_rpr) 5539 5540 ! The integration weight depends on ikcalc 5541 wtqp = one / gwr%nkbz; wtqm = zero 5542 if (gwr%dtset%symsigma /= 0) then 5543 ! If symsigma, symmetrize the matrix elements. 5544 ! Sum only q"s in IBZ_k. In this case elements are weighted 5545 ! according to wtqp and wtqm. wtqm is for time-reversal. 5546 !call ltg_kcalc(ikcalc)%get_weigts_ibz(ik_bz, wkbz_pm) 5547 associate (ltg_k => ltg_kcalc(ikcalc)) 5548 !if (can_symmetrize(spin)) then 5549 wtqp = (one * sum(ltg_k%wtksym(1,:,ik_bz))) / gwr%nkbz ! FIXME: iq_bz or ik_bz? 5550 wtqm = (one * sum(ltg_k%wtksym(2,:,ik_bz))) / gwr%nkbz 5551 end associate 5552 end if 5553 5554 do ipm=1,2 5555 if (abs(wtqm) < tol12) then 5556 sigc_rpr(1,ipm,ikcalc)%buffer_cplx = sigc_rpr(1,ipm,ikcalc)%buffer_cplx + & 5557 wtqp * gk_rpr_pm(ipm)%buffer_cplx * wc_rpr%buffer_cplx 5558 else 5559 ABI_ERROR(sjoin("TR is not yet implemented:, wqtm:", ftoa(wtqm))) 5560 sigc_rpr(1,ipm,ikcalc)%buffer_cplx = sigc_rpr(1,ipm,ikcalc)%buffer_cplx + & 5561 wtqp * (gk_rpr_pm(ipm)%buffer_cplx * wc_rpr%buffer_cplx) 5562 5563 sigc_rpr(2,ipm,ikcalc)%buffer_cplx = sigc_rpr(2,ipm,ikcalc)%buffer_cplx + & 5564 wtqm * conjg(gk_rpr_pm(ipm)%buffer_cplx * wc_rpr%buffer_cplx) 5565 5566 !sigc_rpr(1, ipm, ikcalc)%buffer_cplx = sigc_rpr(1, ipm, ikcalc)%buffer_cplx + & 5567 ! (wtqp + wtqm) * real(gk_rpr_pm(ipm)%buffer_cplx * wc_rpr%buffer_cplx, kind=gwpc) & 5568 ! + (wtqp - wtqm) * j_gw * aimag(gk_rpr_pm(ipm)%buffer_cplx * wc_rpr%buffer_cplx) 5569 end if 5570 end do ! ipm 5571 5572 end do ! ikcalc 5573 5574 if (print_time) then 5575 write(msg,'(4x,3(a,i0),a)')"Sigma_c my_ikf [", my_ikf, "/", gwr%my_nkbz, "] (tot: ", gwr%nkbz, ")" 5576 call cwtime_report(msg, cpu_ikf, wall_ikf, gflops_ikf) 5577 end if 5578 end do ! my_ikf 5579 5580 ! Deallocate extra Wc matrices defined by got_qibz 5581 call gwr%redistrib_mats_qibz("wc", itau, spin, need_qibz, got_qibz, "free") 5582 5583 ! Integrate self-energy matrix elements in the unit cell. 5584 ! Remember that Sigma is stored as (r',r) and that the second dimension is MPI-distributed. 5585 ! In case of k or g distribution, sigc_pm is a partial 6d integral that will be ALL_REDUCED in gwr%comm afterwards. 5586 ! TODO: Off-diagonal terms although this is not the most efficient algorithm 5587 do ikcalc=1,gwr%nkcalc 5588 do band=gwr%bstart_ks(ikcalc, spin), gwr%bstop_ks(ikcalc, spin) 5589 call sig_braket_ur(sigc_rpr(:,:,ikcalc), gwr%g_nfft*gwr%nspinor, uc_psir_bk(:,band,ikcalc), sigc_pm) 5590 if (gwr%sig_diago) then 5591 sigc_it_mat(:, itau, band, 1, ikcalc, spin) = sigc_pm 5592 end if 5593 end do 5594 end do ! ikcalc 5595 !ABI_FREE(loc_cwork) 5596 5597 write(msg,'(3(a,i0),a)')" Sigma_c my_itau [", my_it, "/", gwr%my_ntau, "] (tot: ", gwr%ntau, ")" 5598 call cwtime_report(msg, cpu_tau, wall_tau, gflops_tau) 5599 end do ! my_it 5600 5601 ABI_FREE(uc_psir_bk) 5602 end do ! my_is 5603 5604 sigc_it_mat = -sigc_it_mat * (one/gwr%g_nfft) ** 2 5605 5606 call wc_rpr%free(); call slk_array_free(sigc_rpr); call slk_array_free(gk_rpr_pm) 5607 do ikcalc=1,gwr%nkcalc 5608 call ltg_kcalc(ikcalc)%free() 5609 end do 5610 call wrtout(std_out, " Mixed space algorithm for sigma completed") 5611 end if 5612 5613 call sigijtab_free(Sigcij_tab) 5614 ABI_FREE(Sigcij_tab) 5615 5616 ! Collect results and average 5617 call xmpi_sum(sigc_it_mat, gwr%comm%value, ierr) 5618 5619 if (gwr%dtset%symsigma == +1 .and. .not. gwr%use_supercell_for_sigma) then 5620 call wrtout(std_out, " Averaging Sig_c matrix elements within degenerate subspaces.") 5621 ABI_CHECK(gwr%sig_diago, "symsigma = 1 requires diagonal Sigma_c") 5622 do spin=1,gwr%nsppol 5623 do ikcalc=1,gwr%nkcalc 5624 do ideg=1,size(gwr%degtab(ikcalc, spin)%bids) 5625 associate (bids => gwr%degtab(ikcalc, spin)%bids(ideg)%vals) 5626 nstates = size(bids) 5627 avg_2ntau = sum(sigc_it_mat(:,:,bids(:), 1,ikcalc, spin), dim=3) / nstates 5628 do ii=1,nstates 5629 sigc_it_mat(:,:,bids(ii), 1,ikcalc, spin) = avg_2ntau 5630 end do 5631 end associate 5632 end do ! ideg 5633 end do 5634 end do 5635 end if ! symsigma == +1 5636 5637 ! Store matrix elements of Sigma_c(it), separate even and odd part 5638 ! then use sine/cosine transform to get Sigma_c(i omega). 5639 ! Finally, perform analytic continuation with Pade' to go to the real-axis 5640 ! and compute QP corrections and spectral functions. All procs execute this part as it's very cheap. 5641 5642 imag_zmesh(:) = j_dpc * gwr%iw_mesh 5643 5644 ! Save previous QP bands in qp_ebands_prev (needed for self-consistency) 5645 ! In the loop below, we also update gwr%qp_ebands%eig with the QP results and recompute occ/fermie. 5646 gwr%qp_ebands_prev%eig = gwr%qp_ebands%eig 5647 gwr%qp_ebands_prev%occ = gwr%qp_ebands%occ 5648 5649 e0_kcalc = zero; spfunc_diag = zero; pade_solver_ierr = 0; ze0_kcalc = zero; sigc_e0 = zero 5650 qpz_ene = zero; qp_pade = zero; sigxc_rw_diag = zero 5651 ks_gaps = -one; qpz_gaps = -one !; qp_pade_gaps = -one 5652 5653 do spin=1,gwr%nsppol 5654 do ikcalc=1,gwr%nkcalc 5655 ik_ibz = gwr%kcalc2ibz(ikcalc, 1) 5656 do band=gwr%bstart_ks(ikcalc, spin), gwr%bstop_ks(ikcalc, spin) 5657 ibc = band - gwr%bstart_ks(ikcalc, spin) + 1 5658 5659 ! FT Sigma(itau) --> Sigma(iw) 5660 band2_start = 1; band2_stop = 1 5661 if (.not. gwr%sig_diago) then 5662 band2_start = gwr%bstart_ks(ikcalc, spin); band2_stop = gwr%bstop_ks(ikcalc, spin) 5663 end if 5664 do band2=band2_start, band2_stop 5665 ! f(t) = E(t) + O(t) = (f(t) + f(-t)) / 2 + (f(t) - f(-t)) / 2 5666 associate (vals_pmt => sigc_it_mat(:,:, band, band2 ,ikcalc, spin)) 5667 even_t = (vals_pmt(1,:) + vals_pmt(2,:)) / two; odd_t = (vals_pmt(1,:) - vals_pmt(2,:)) / two 5668 gwr%sigc_iw_mat(:, band, band2, ikcalc, spin) = matmul(gwr%cosft_wt, even_t) + j_dpc * matmul(gwr%sinft_wt, odd_t) 5669 end associate 5670 end do 5671 5672 ! NB: e0 is always set to the KS energy even in case of self-consistency. 5673 e0 = gwr%ks_ebands%eig(band, ik_ibz, spin) 5674 if ( gwr%sig_diago) sigx = gwr%sigx_mat(band, 1, ikcalc, spin) 5675 if (.not. gwr%sig_diago) sigx = gwr%sigx_mat(band, band, ikcalc, spin) 5676 5677 ! Note vxc[n_val] instead of vxc[n_val + n_nlcc] with the model core charge. 5678 vxc_val = gwr%ks_me%vxcval(band, band, ik_ibz, spin) 5679 vu = zero; if (gwr%dtset%usepawu /= 0) vu = gwr%ks_me%vu(band, band, ik_ibz, spin) 5680 v_meanf = vxc_val + vu 5681 5682 band2 = merge(1, band, gwr%sig_diago) 5683 pade_npts = gwr%ntau 5684 if (gwr%dtset%userie > 0 .and. pade_npts > gwr%dtset%userie) then 5685 pade_npts = min(gwr%ntau, gwr%dtset%userie) 5686 call wrtout(std_out, sjoin("Limiting the number of points for pade to:", itoa(pade_npts))) 5687 end if 5688 call spade%init(pade_npts, imag_zmesh, gwr%sigc_iw_mat(:, band, band2, ikcalc, spin), branch_cut=">") 5689 5690 ! Solve the QP equation with Newton-Rapson starting from e0 5691 zz = cmplx(e0, zero) 5692 call spade%qp_solve(e0, v_meanf, sigx, zz, zsc, msg, ierr) 5693 qp_pade(band, ikcalc, spin) = zsc 5694 pade_solver_ierr(band, ikcalc, spin) = ierr 5695 ABI_WARNING_IF(ierr /= 0, msg) 5696 5697 call spade%eval(zz, sigc_e0__, dzdval=dsigc_de0) 5698 ! Z = (1 - dSigma / domega(E0))^{-1} 5699 z_e0 = one / (one - dsigc_de0) 5700 5701 ! Compute linearized QP solution and store results 5702 qp_ene = e0 + z_e0 * (sigc_e0__ + sigx - v_meanf) 5703 qpz_ene(band, ikcalc, spin) = qp_ene 5704 e0_kcalc(band, ikcalc, spin) = e0 5705 sigc_e0(band, ikcalc, spin) = sigc_e0__ 5706 ze0_kcalc(band, ikcalc, spin) = z_e0 5707 5708 ! IMPORTANT: Here we update qp_ebands%eig with the new enes obtained with the linearized QP equation 5709 gwr%qp_ebands%eig(band, ik_ibz, spin) = real(qp_ene) 5710 5711 ! Compute Spectral function using linear mesh **centered** around KS e0. 5712 rw_mesh = arth(e0 - gwr%wr_step * (gwr%nwr / 2), gwr%wr_step, gwr%nwr) 5713 hhartree_bk = gwr%ks_ebands%eig(band, ik_ibz, spin) - v_meanf 5714 do iw=1,gwr%nwr 5715 zz = rw_mesh(iw) 5716 call spade%eval(zz, sigc_e0__) 5717 sig_xc = sigx + sigc_e0__ 5718 sigxc_rw_diag(iw, band, ikcalc, spin) = sig_xc 5719 5720 spfunc_diag(iw, band, ikcalc, spin) = one / pi * abs(aimag(sigc_e0__)) & 5721 / ( (real(rw_mesh(iw) - hhartree_bk - sig_xc)) ** 2 + (aimag(sigc_e0__)) ** 2) ! / Ha_eV 5722 5723 !Sr%hhartree = hdft - KS_me%vxcval 5724 !spfunc_diag(iw, band, ikcalc, spin) = & 5725 ! one / pi * abs(aimag(sigc_e0__)) & 5726 ! /( (real(rw_mesh(iw) - Sr%hhartree(ib, ib, ik_ibz, spin) - sigx_xc)) ** 2 & 5727 ! +(aimag(sigc_e0__)) ** 2) / Ha_eV 5728 end do ! iw 5729 5730 end do ! band 5731 end do ! ikcalc 5732 end do ! spin 5733 5734 if (gwr%nkcalc == gwr%nkibz) then 5735 ! Shift the bands that are not explicitly included in the SCF calculation. 5736 ! using the correction evaluated at bstop_ks/bstart_ks to accelerate self-consistent calculations. 5737 do spin=1,gwr%nsppol 5738 do ikcalc=1,gwr%nkcalc 5739 ik_ibz = gwr%kcalc2ibz(ikcalc, 1) 5740 band = gwr%bstop_ks(ikcalc, spin) 5741 if (band + 1 <= size(gwr%qp_ebands%eig, dim=1)) then 5742 eshift = gwr%qp_ebands%eig(band, ik_ibz, spin) - gwr%qp_ebands_prev%eig(band, ik_ibz, spin) 5743 call wrtout(std_out, sjoin(" Correcting bands >= ", itoa(band+1), "with eshift:", ftoa(eshift * Ha_meV), "(meV)")) 5744 gwr%qp_ebands%eig(band + 1:, ik_ibz, spin) = gwr%qp_ebands%eig(band + 1:, ik_ibz, spin) + eshift 5745 end if 5746 band = gwr%bstart_ks(ikcalc, spin) 5747 if (band > 1) then ! unlikely 5748 eshift = gwr%qp_ebands%eig(band, ik_ibz, spin) - gwr%qp_ebands_prev%eig(band, ik_ibz, spin) 5749 call wrtout(std_out, sjoin(" Correcting bands < ", itoa(band), "with eshift:", ftoa(eshift * Ha_meV), "(meV)")) 5750 gwr%qp_ebands%eig(:band - 1, ik_ibz, spin) = gwr%qp_ebands%eig(:band - 1, ik_ibz, spin) + eshift 5751 end if 5752 end do 5753 end do 5754 5755 ! Recompute occupancies and set fermie to zero. 5756 ! FIXME: Possible problem here if the QP energies are not ordered! 5757 call ebands_update_occ(gwr%qp_ebands, gwr%dtset%spinmagntarget, prtvol=gwr%dtset%prtvol, fermie_to_zero=.True.) 5758 end if 5759 5760 if (gwr%comm%me == 0) then 5761 ! Master writes results to ab_out, std_out and GWR.nc 5762 if (any(pade_solver_ierr /= 0)) then 5763 ! Write warning if QP solver failed. 5764 ierr = count(pade_solver_ierr /= 0) 5765 call wrtout([ab_out, std_out], sjoin("QP solver failed for:", itoa(ierr), "states")) 5766 end if 5767 5768 call write_notations([std_out, ab_out]) 5769 do spin=1,gwr%nsppol 5770 do ikcalc=1,gwr%nkcalc 5771 ik_ibz = gwr%kcalc2ibz(ikcalc, 1) 5772 5773 ydoc = yamldoc_open('GWR_SelfEnergy_ee', width=11, real_fmt='(3f8.3)') 5774 call ydoc%add_real1d('kpoint', gwr%kcalc(:, ikcalc)) 5775 call ydoc%add_int('spin', spin, int_fmt="(i1)") 5776 call ydoc%add_int('gwr_scf_iteration', gwr%scf_iteration) 5777 call ydoc%add_string('gwr_task', gwr%dtset%gwr_task) 5778 5779 ! Compute gaps assumim KS band indices. 5780 band_val = gwr%ks_vbik(ik_ibz, spin) 5781 nbc = gwr%bstop_ks(ikcalc, spin) - gwr%bstart_ks(ikcalc, spin) + 1 5782 ib1 = gwr%bstart_ks(ikcalc, spin); ib2 = gwr%bstop_ks(ikcalc, spin) 5783 5784 if (band_val >= gwr%bstart_ks(ikcalc, spin) .and. band_val + 1 <= gwr%bstop_ks(ikcalc, spin)) then 5785 ibv = band_val - gwr%bstart_ks(ikcalc, spin) + 1 5786 ks_gap = gwr%ks_ebands%eig(band_val+1, ik_ibz, spin) - gwr%ks_ebands%eig(band_val, ik_ibz, spin) 5787 5788 ! This to detect a possible band inversion and compute qp_gaps accordingly. 5789 band_inversion = .False. 5790 call sort_rvals(nbc, real(qpz_ene(ib1:, ikcalc, spin)), iperm, sorted_qpe, tol=tol12) 5791 5792 if (iperm(ibv) /= ibv .or. iperm(ibv + 1) /= ibv + 1) then 5793 band_inversion = .True. 5794 call ydoc%add_int('QP_VBM_band', iperm(ibv) + gwr%bstart_ks(ikcalc, spin) - 1) 5795 call ydoc%add_int('QP_CBM_band', iperm(ibv+1) + gwr%bstart_ks(ikcalc, spin) - 1) 5796 qp_gap = sorted_qpe(ibv+1) - sorted_qpe(ibv) 5797 !qp_pade_gap = qp_pade(band_val+1, ikcalc, spin) - qp_pade(band_val, ikcalc, spin) 5798 else 5799 call ydoc%add_int('QP_VBM_band', ibv + gwr%bstart_ks(ikcalc, spin) - 1) 5800 call ydoc%add_int('QP_CBM_band', ibv+1 + gwr%bstart_ks(ikcalc, spin) - 1) 5801 qp_gap = gwr%qp_ebands%eig(band_val+1, ik_ibz, spin) - gwr%qp_ebands%eig(band_val, ik_ibz, spin) 5802 qp_pade_gap = qp_pade(band_val+1, ikcalc, spin) - qp_pade(band_val, ikcalc, spin) 5803 end if 5804 ABI_FREE(iperm) 5805 ABI_FREE(sorted_qpe) 5806 5807 call ydoc%add_real('KS_gap', ks_gap * Ha_eV) 5808 call ydoc%add_real('QP_gap', qp_gap * Ha_eV) 5809 call ydoc%add_real('Delta_QP_KS', (qp_gap - ks_gap) * Ha_eV) 5810 ks_gaps(ikcalc, spin)= ks_gap 5811 qpz_gaps(ikcalc, spin) = qp_gap 5812 !qp_pade_gaps(ikcalc, spin) = qp_pade_gap 5813 end if 5814 5815 call ydoc%open_tabular('data') !, tag='SigmaeeData') 5816 write(msg, "(a5, *(a9))") "Band", "E0", "<VxcDFT>", "SigX", "SigC(E0)", "Z", "E-E0", "E-Eprev", "E", "Occ(E)" 5817 call ydoc%add_tabular_line(msg) 5818 5819 do band=gwr%bstart_ks(ikcalc, spin), gwr%bstop_ks(ikcalc, spin) 5820 ibc = band - gwr%bstart_ks(ikcalc, spin) + 1 5821 e0 = gwr%ks_ebands%eig(band, ik_ibz, spin) 5822 qp_ene = gwr%qp_ebands%eig(band, ik_ibz, spin) 5823 qp_ene_prev = gwr%qp_ebands_prev%eig(band, ik_ibz, spin) 5824 if ( gwr%sig_diago) sigx = gwr%sigx_mat(band, 1, ikcalc, spin) 5825 if (.not. gwr%sig_diago) sigx = gwr%sigx_mat(band, band, ikcalc, spin) 5826 5827 write(msg,'(i5, *(f9.3))') & 5828 band, & ! Band 5829 e0 * Ha_eV, & ! E0 5830 real(gwr%ks_me%vxcval(band, band, ik_ibz, spin)) * Ha_eV, & ! <VxcDFT> 5831 real(sigx) * Ha_eV, & ! SigX 5832 real(sigc_e0(band, ikcalc, spin)) * Ha_eV, & ! SigC(E0) 5833 real(ze0_kcalc(band, ikcalc, spin)), & ! Z 5834 (real(qp_ene - e0)) * Ha_eV, & ! E-E0 5835 real(qp_ene - qp_ene_prev) * Ha_eV, & ! E-Eprev 5836 real(qp_ene) * Ha_eV, & ! E 5837 gwr%qp_ebands%occ(band, ik_ibz, spin) ! Occ(E) 5838 call ydoc%add_tabular_line(msg) 5839 end do 5840 5841 call ydoc%write_units_and_free([std_out, ab_out]) 5842 end do ! ikcalc 5843 end do ! spin 5844 5845 ! Print KS and QP gaps 5846 msg = "Kohn-Sham gaps and band edges from IBZ mesh" 5847 call gwr%ks_gaps%print(unit=std_out, header=msg) 5848 call gwr%ks_gaps%print(unit=ab_out, header=msg) 5849 5850 new_gaps = ebands_get_gaps(gwr%qp_ebands, ierr) 5851 write(msg,"(a,i0,a)")" QP gaps and band edges taking into account Sigma_nk corrections for ",gwr%nkcalc," k-points" 5852 call new_gaps%print(unit=std_out, header=msg) 5853 call new_gaps%print(unit=ab_out, header=msg) 5854 if (ierr /= 0) then 5855 ABI_WARNING("Cannot compute QP fundamental and direct gap (likely metal)") 5856 end if 5857 call new_gaps%free() 5858 5859 ! Write results to txt files. 5860 if (open_file(strcat(gwr%dtfil%filnam_ds(4), '_SIGC_IT'), msg, newunit=unt_it, action="write") /= 0) then 5861 ABI_ERROR(msg) 5862 end if 5863 write(unt_it, "(a)")"# Diagonal elements of Sigma_c(i tau, +/-) in atomic units" 5864 write(unt_it, "(a)")"# tau Re/Im Sigma_c(+itau) Re/Im Sigma_c(-itau)" 5865 5866 if (open_file(strcat(gwr%dtfil%filnam_ds(4), '_SIGXC_IW'), msg, newunit=unt_iw, action="write") /= 0) then 5867 ABI_ERROR(msg) 5868 end if 5869 write(unt_iw, "(a)")"# Diagonal elements of Sigma_xc(i omega) in eV units" 5870 write(unt_iw, "(a)")"# omega Re/Im Sigma_c(i omega)" 5871 5872 if (open_file(strcat(gwr%dtfil%filnam_ds(4), '_SIGXC_RW'), msg, newunit=unt_rw, action="write") /= 0) then 5873 ABI_ERROR(msg) 5874 end if 5875 write(unt_rw, "(a)")"# Diagonal elements of Sigma_xc(omega) in eV units and spectral function A(omega)" 5876 write(unt_rw, "(a)")"# omega Re/Im Sigma_xc(omega), A(omega)" 5877 5878 dat_units = [unt_it, unt_iw, unt_rw] 5879 call write_units(dat_units, "# Fermi energy set to zero. Energies in eV") 5880 call write_units(dat_units, sjoin("# nkcalc:", itoa(gwr%nkcalc), ", nsppol:", itoa(gwr%nsppol))) 5881 5882 ! TODO: Improve file format. Add compatibility with gnuplot format for datasets? 5883 do spin=1,gwr%nsppol 5884 do ikcalc=1,gwr%nkcalc 5885 ik_ibz = gwr%kcalc2ibz(ikcalc, 1) 5886 call write_units(dat_units, sjoin("# kpt:", ktoa(gwr%kcalc(:, ikcalc)), "spin:", itoa(spin))) 5887 do band=gwr%bstart_ks(ikcalc, spin), gwr%bstop_ks(ikcalc, spin) 5888 ibc = band - gwr%bstart_ks(ikcalc, spin) + 1 5889 e0 = gwr%ks_ebands%eig(band, ik_ibz, spin) 5890 band2 = merge(1, band, gwr%sig_diago) 5891 sigx = gwr%sigx_mat(band, band2, ikcalc, spin) 5892 5893 call write_units(dat_units, sjoin("# band:", itoa(band), ", spin:", itoa(spin))) 5894 call write_units(dat_units, sjoin("# sigx_ev:", ftoa(sigx * Ha_eV))) 5895 5896 do itau=1,gwr%ntau 5897 ! FIXME itau is not ordered 5898 write(unt_it, "(*(es16.8))") & 5899 gwr%tau_mesh(itau), & 5900 c2r(sigc_it_mat(1, itau, band, band2, ikcalc, spin)), & 5901 c2r(sigc_it_mat(2, itau, band, band2, ikcalc, spin)) 5902 write(unt_iw, "(*(es16.8))") & 5903 gwr%iw_mesh(itau) * Ha_eV, & 5904 (c2r(gwr%sigc_iw_mat(itau, band, band2, ikcalc, spin) + sigx)) * Ha_eV 5905 end do 5906 5907 ! Write Sigma_xc(omega) and A(omega) 5908 rw_mesh = arth(e0 - gwr%wr_step * (gwr%nwr / 2), gwr%wr_step, gwr%nwr) * Ha_eV 5909 do iw=1,gwr%nwr 5910 write(unt_rw, "(*(es16.8))") & 5911 rw_mesh(iw), & 5912 c2r(sigxc_rw_diag(iw, band, ikcalc, spin)) * Ha_eV, & 5913 spfunc_diag(iw, band, ikcalc, spin) / Ha_eV 5914 end do 5915 end do 5916 end do 5917 end do 5918 5919 close(unt_it); close(unt_iw); close(unt_rw) 5920 5921 ! ====================== 5922 ! Add results to GWR.nc 5923 ! ====================== 5924 NCF_CHECK(nctk_open_modify(ncid, gwr%gwrnc_path, xmpi_comm_self)) 5925 5926 ! Define arrays with results. 5927 define = .True. 5928 if (define) then 5929 ncerr = nctk_def_arrays(ncid, [ & 5930 nctkarr_t("e0_kcalc", "dp", "smat_bsize1, nkcalc, nsppol"), & 5931 nctkarr_t("ze0_kcalc", "dp", "two, smat_bsize1, nkcalc, nsppol"), & 5932 nctkarr_t("qpz_ene", "dp", "two, smat_bsize1, nkcalc, nsppol"), & 5933 nctkarr_t("qp_pade", "dp", "two, smat_bsize1, nkcalc, nsppol"), & 5934 nctkarr_t("pade_solver_ierr", "int", "smat_bsize1, nkcalc, nsppol"), & 5935 nctkarr_t("ks_gaps", "dp", "nkcalc, nsppol"), & 5936 nctkarr_t("qpz_gaps", "dp", "nkcalc, nsppol"), & 5937 !nctkarr_t("qp_pade_gaps", "dp", "nkcalc, nsppol"), & 5938 nctkarr_t("sigx_mat", "dp", "two, smat_bsize1, smat_bsize2, nkcalc, nsppol"), & 5939 nctkarr_t("sigc_it_mat", "dp", "two, two, ntau, smat_bsize1, smat_bsize2, nkcalc, nsppol"), & 5940 nctkarr_t("sigc_iw_mat", "dp", "two, ntau, smat_bsize1, smat_bsize2, nkcalc, nsppol"), & 5941 nctkarr_t("sigxc_rw_diag", "dp", "two, nwr, smat_bsize1, nkcalc, nsppol"), & 5942 nctkarr_t("spfunc_diag", "dp", "nwr, smat_bsize1, nkcalc, nsppol") & 5943 ]) 5944 NCF_CHECK(ncerr) 5945 end if 5946 5947 ! Write data. 5948 NCF_CHECK(nctk_set_datamode(ncid)) 5949 NCF_CHECK(nf90_put_var(ncid, vid("e0_kcalc"), e0_kcalc)) 5950 NCF_CHECK(nf90_put_var(ncid, vid("ze0_kcalc"), c2r(ze0_kcalc))) 5951 NCF_CHECK(nf90_put_var(ncid, vid("sigx_mat"), c2r(gwr%sigx_mat))) 5952 NCF_CHECK(nf90_put_var(ncid, vid("qpz_ene"), c2r(qpz_ene))) 5953 NCF_CHECK(nf90_put_var(ncid, vid("qp_pade"), c2r(qp_pade))) 5954 NCF_CHECK(nf90_put_var(ncid, vid("pade_solver_ierr"), pade_solver_ierr)) 5955 NCF_CHECK(nf90_put_var(ncid, vid("ks_gaps"), ks_gaps)) 5956 NCF_CHECK(nf90_put_var(ncid, vid("qpz_gaps"), qpz_gaps)) 5957 !NCF_CHECK(nf90_put_var(ncid, vid("qp_pade_gaps"), qp_pade_gaps)) 5958 NCF_CHECK(nf90_put_var(ncid, vid("sigc_it_mat"), c2r(sigc_it_mat))) 5959 NCF_CHECK(nf90_put_var(ncid, vid("sigc_iw_mat"), c2r(gwr%sigc_iw_mat))) 5960 NCF_CHECK(nf90_put_var(ncid, vid("sigxc_rw_diag"), c2r(sigxc_rw_diag))) 5961 NCF_CHECK(nf90_put_var(ncid, vid("spfunc_diag"), spfunc_diag)) 5962 NCF_CHECK(nf90_close(ncid)) 5963 end if ! master 5964 5965 ABI_FREE(sigc_it_mat) 5966 !ABI_FREE(sigc_iw_mat) 5967 5968 call cwtime_report(" gwr_build_sigmac:", cpu_all, wall_all, gflops_all) 5969 call timab(1925, 2, tsec) 5970 5971 contains 5972 integer function vid(vname) 5973 character(len=*),intent(in) :: vname 5974 vid = nctk_idname(ncid, vname) 5975 end function vid 5976 5977 subroutine print_sigma_header() 5978 5979 if (gwr%comm%me /= 0) return 5980 if (gwr%use_supercell_for_sigma) then 5981 call wrtout(std_out, sjoin(" Building Sigma_c in the supercell with FFT mesh:", ltoa(sc_ngfft(1:3))), pre_newlines=2) 5982 else 5983 call wrtout([std_out,ab_out], " Building Sigma_c with convolutions in k-space:", pre_newlines=2) 5984 end if 5985 call wrtout(std_out, sjoin(" gwr_np_kgts:", ltoa(gwr%dtset%gwr_np_kgts))) 5986 call wrtout(std_out, sjoin(" ngkpt:", ltoa(gwr%ngkpt), " ngqpt:", ltoa(gwr%ngqpt))) 5987 call wrtout(std_out, sjoin(" gwr_boxcutmin:", ftoa(gwr%dtset%gwr_boxcutmin))) 5988 call wrtout(std_out, sjoin(" my_ntau:", itoa(gwr%my_ntau), "ntau:", itoa(gwr%ntau))) 5989 call wrtout(std_out, sjoin(" my_nkbz:", itoa(gwr%my_nkbz), "nkibz:", itoa(gwr%nkibz))) 5990 call wrtout(std_out, sjoin("- FFT uc_batch_size:", itoa(gwr%uc_batch_size))) 5991 call wrtout(std_out, sjoin("- FFT sc_batch_size:", itoa(gwr%sc_batch_size)), do_flush=.True.) 5992 5993 end subroutine print_sigma_header 5994 5995 #endif 5996 end subroutine gwr_build_sigmac
m_gwr/gwr_build_sigxme [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_build_sigxme
FUNCTION
Compute matrix elements of the exchange part.
INPUTS
OUTPUT
SOURCE
7345 subroutine gwr_build_sigxme(gwr, compute_qp) 7346 7347 !Arguments ------------------------------------ 7348 class(gwr_t),target,intent(inout) :: gwr 7349 logical,optional,intent(in) :: compute_qp 7350 7351 !Local variables------------------------------- 7352 !scalars 7353 integer :: nsppol, nspinor, ierr, my_ikf, band_sum, ii, jj, kb, il_b, iab !ig_start, ig, 7354 integer :: my_is, ikcalc, ikcalc_ibz, bmin, bmax, band, istwf_k, npw_k 7355 integer :: spin, jb, is_idx, use_umklp 7356 integer :: spad, wtqm, wtqp, irow, spadx1, spadx2 7357 integer :: npwx, u_nfft, u_mgfft, u_mpw 7358 integer :: ik_bz, ik_ibz, isym_k, trev_k, g0_k(3) 7359 integer :: iq_bz, iq_ibz, isym_q, trev_q, g0_q(3) 7360 logical :: isirr_k, isirr_q, sigc_is_herm, compute_qp__ 7361 real(dp) :: fact_spin, theta_mu_minus_esum, theta_mu_minus_esum2, tol_empty, tol_empty_in, gwr_boxcutmin_x 7362 real(dp) :: cpu_k, wall_k, gflops_k, cpu_all, wall_all, gflops_all 7363 character(len=5000) :: msg 7364 logical :: q_is_gamma 7365 type(__slkmat_t),pointer :: ugb_kibz 7366 type(crystal_t),pointer :: cryst 7367 type(dataset_type),pointer :: dtset 7368 type(littlegroup_t) :: ltg_k 7369 type(desc_t),pointer :: desc_ki 7370 !arrays 7371 integer :: g0(3), gmax(3), spinor_padx(2,4), u_ngfft(18), work_ngfft(18), units(2) 7372 integer,allocatable :: gbound_kcalc(:,:), gvec_x(:,:), gbound_x(:,:), kg_k(:,:), gbound_ksum(:,:) 7373 real(dp) :: ksum(3), kk_ibz(3), kgw(3), kgw_m_ksum(3), qq_bz(3), tsec(2) !, kk_bz(3), q0(3) !, spinrot_kbz(4), spinrot_kgw(4) 7374 real(dp),contiguous, pointer :: ks_eig(:,:,:), qp_eig(:,:,:), qp_occ(:,:,:), cg2_ptr(:,:) ! cg1_ptr(:,:), 7375 real(dp),allocatable :: work(:,:,:,:), cg1_ibz(:,:) !, cg2_bz(:,:) 7376 complex(gwpc),allocatable :: vc_sqrt_qbz(:) 7377 complex(dp),allocatable :: rhotwg(:), rhotwgp(:), rhotwg_ki(:,:) 7378 complex(gwpc),allocatable :: ur_bdgw(:,:) 7379 complex(dp),allocatable :: ur_ksum(:), ur_prod(:), eig0r(:) 7380 complex(dp),target,allocatable :: ug_ksum(:) 7381 complex(dp),allocatable :: sigxcme_tmp(:,:), sigxme_tmp(:,:,:), sigx(:,:,:,:) 7382 complex(dp) :: gwpc_sigxme, gwpc_sigxme2, xdot_tmp 7383 type(sigijtab_t),allocatable :: Sigxij_tab(:,:), Sigcij_tab(:,:) 7384 7385 ! ************************************************************************* 7386 7387 call timab(1920, 1, tsec) 7388 call cwtime(cpu_all, wall_all, gflops_all, "start") 7389 units = [std_out, ab_out] 7390 7391 nsppol = gwr%nsppol; nspinor = gwr%nspinor; cryst => gwr%cryst; dtset => gwr%dtset 7392 7393 ! Table for \Sigmax_ij matrix elements. 7394 sigc_is_herm = .False. 7395 call sigtk_sigma_tables(gwr%nkcalc, gwr%nkibz, gwr%nsppol, gwr%bstart_ks, gwr%bstop_ks, gwr%kcalc2ibz(:,1), & 7396 gwr%sig_diago, sigc_is_herm, sigxij_tab, sigcij_tab) 7397 7398 call sigijtab_free(Sigcij_tab) 7399 ABI_FREE(Sigcij_tab) 7400 7401 if (gwr%sig_diago) then 7402 call wrtout(units, " Computing diagonal matrix elements of Sigma_x", pre_newlines=1) 7403 else 7404 call wrtout(units, " Computing diagonal + off-diagonal matrix elements of Sigma_x", pre_newlines=1) 7405 end if 7406 7407 ! Allocate array with Sigma_x matrix elements depending on sig_diago 7408 ii = gwr%b1gw; jj = gwr%b2gw 7409 if (gwr%sig_diago) then 7410 ii = 1; jj = 1 7411 end if 7412 ABI_RECALLOC(gwr%sigx_mat, (gwr%b1gw:gwr%b2gw, ii:jj, gwr%nkcalc, gwr%nsppol*gwr%nsig_ab)) 7413 7414 ks_eig => gwr%ks_ebands%eig 7415 if (gwr%scf_iteration == 1) then 7416 call wrtout(units, " Using KS orbitals and KS energies...", newlines=1, do_flush=.True.) 7417 qp_eig => gwr%ks_ebands%eig; qp_occ => gwr%ks_ebands%occ 7418 else 7419 call wrtout(units, " Using KS orbitals and QP energies...", newlines=1, do_flush=.True.) 7420 qp_eig => gwr%qp_ebands%eig; qp_occ => gwr%qp_ebands%occ 7421 end if 7422 7423 ! MRM allow lower occ numbers 7424 ! Normalization of theta_mu_minus_esum. If nsppol==2, qp_occ $\in [0,1]$ 7425 tol_empty_in = 0.01 ! Initialize the tolerance used to decide if a band is empty (passed to m_sigx.F90) 7426 select case (nsppol) 7427 case (1) 7428 fact_spin = half; tol_empty = tol_empty_in ! below this value the state is assumed empty 7429 if (nspinor == 2) then 7430 fact_spin = one; tol_empty = half * tol_empty_in ! below this value the state is assumed empty 7431 end if 7432 case (2) 7433 fact_spin = one; tol_empty = half * tol_empty_in ! to be consistent and obtain similar results if a metallic 7434 case default ! spin unpolarized system is treated using nsppol==2 7435 ABI_BUG(sjoin('Wrong nsppol:', itoa(nsppol))) 7436 end select 7437 7438 ! ========================================= 7439 ! Find FFT mesh and max number of g-vectors 7440 ! ========================================= 7441 gwr_boxcutmin_x = two 7442 call gwr%get_u_ngfft(gwr_boxcutmin_x, u_ngfft, u_nfft, u_mgfft, u_mpw, gmax) 7443 7444 if (gwr%comm%me == 0) then 7445 call print_ngfft(u_ngfft, header="FFT mesh for Sigma_x", unit=std_out) 7446 !call print_ngfft(u_ngfft, header="FFT mesh for Sigma_x", unit=ab_out) 7447 end if 7448 7449 ! Init work_ngfft 7450 gmax = gmax + 4 ! FIXME: this is to account for umklapp, should also consider Gamma-only and istwfk 7451 gmax = 2 * gmax + 1 7452 call ngfft_seq(work_ngfft, gmax) 7453 !write(std_out,*)"work_ngfft(1:3): ",work_ngfft(1:3) 7454 ABI_MALLOC(work, (2, work_ngfft(4), work_ngfft(5), work_ngfft(6))) 7455 7456 do my_is=1,gwr%my_nspins 7457 spin = gwr%my_spins(my_is) 7458 do ikcalc=1,gwr%nkcalc ! TODO: Should be spin dependent! 7459 call cwtime(cpu_k, wall_k, gflops_k, "start") 7460 ikcalc_ibz = gwr%kcalc2ibz(ikcalc, 1) 7461 kgw = gwr%kcalc(:, ikcalc) 7462 bmin = gwr%bstart_ks(ikcalc, spin); bmax = gwr%bstop_ks(ikcalc, spin) 7463 7464 ! ============================================================== 7465 ! ==== Find little group of the k-points for GW corrections ==== 7466 ! ============================================================== 7467 ! * The little group is used only if symsigma == 1 7468 ! * If use_umklp == 1 then symmetries requiring an umklapp to preserve k_gw are included as well. 7469 use_umklp = 1 7470 call ltg_k%init(kgw, gwr%nqbz, gwr%qbz, cryst, use_umklp, npwe=0) 7471 7472 write(msg,'(5a)') ch10, & 7473 ' Calculating <nk|Sigma_x|nk> at k: ',trim(ktoa(kgw)), ", for band range: ", trim(ltoa([bmin, bmax])) 7474 call wrtout(std_out, msg) 7475 7476 ! =============================================== 7477 ! Load wavefunctions for Sigma_x matrix elements 7478 ! =============================================== 7479 ! All procs need ur_bdgw but the IBZ is distributed and, possibly, replicated in gwr%kpt_comm. 7480 ! Here we select the right procs, fill the buffer with the FFT results and then use 7481 ! a dumb xmpi_sum + rescaling to gather the results. 7482 ! FIXME: g-vectors from Green's descriptor or use another array to be able to deal with istwfk == 2? 7483 7484 ABI_MALLOC_OR_DIE(ur_bdgw, (u_nfft * nspinor, bmin:bmax), ierr) 7485 ur_bdgw = czero_gw 7486 7487 if (any(ikcalc_ibz == gwr%my_kibz_inds)) then 7488 associate (desc_kcalc => gwr%green_desc_kibz(ikcalc_ibz), ugb_kcalc => gwr%ugb(ikcalc_ibz, spin)) 7489 ABI_MALLOC(gbound_kcalc, (2 * u_mgfft + 8, 2)) 7490 call sphereboundary(gbound_kcalc, desc_kcalc%istwfk, desc_kcalc%gvec, u_mgfft, desc_kcalc%npw) 7491 7492 do il_b=1,ugb_kcalc%sizeb_local(2) 7493 band = ugb_kcalc%loc2gcol(il_b); if (band < bmin .or. band > bmax) CYCLE 7494 call fft_ug(desc_kcalc%npw, u_nfft, nspinor, ndat1, & 7495 u_mgfft, u_ngfft, desc_kcalc%istwfk, desc_kcalc%gvec, gbound_kcalc, & 7496 gwr%ugb(ikcalc_ibz, spin)%buffer_cplx(:, il_b), & ! in 7497 ur_bdgw(:, band)) ! out 7498 end do 7499 ABI_FREE(gbound_kcalc) 7500 end associate 7501 end if 7502 7503 ! Collect and rescale 7504 !call xmpi_sum(ur_bdgw, gwr%kgt_comm%value, ierr) 7505 call xmpi_sum(ur_bdgw, gwr%kg_comm%value, ierr) 7506 ur_bdgw = ur_bdgw / gwr%np_kibz(ikcalc_ibz) 7507 7508 ABI_MALLOC(ur_prod, (u_nfft * nspinor)) 7509 ABI_MALLOC(ur_ksum, (u_nfft * nspinor)) 7510 ABI_MALLOC(eig0r, (u_nfft * nspinor)) 7511 7512 ABI_CALLOC(sigxme_tmp, (bmin:bmax, bmin:bmax, nsppol * gwr%nsig_ab)) 7513 ABI_CALLOC(sigxcme_tmp, (bmin:bmax, nsppol * gwr%nsig_ab)) 7514 ABI_CALLOC(sigx, (2, bmin:bmax, bmin:bmax, nsppol * gwr%nsig_ab)) 7515 7516 ! ======================================== 7517 ! ==== Sum over my k-points in the BZ ==== 7518 ! ======================================== 7519 7520 do my_ikf=1,gwr%my_nkbz 7521 ik_bz = gwr%my_kbz_inds(my_ikf) 7522 ksum = gwr%kbz(:, ik_bz) 7523 7524 ! Find the symmetrical image of ksum in the IBZ 7525 !call kmesh%get_BZ_item(ik_bz, ksum, ik_ibz, isym_ki, iik, ph_mkt) 7526 7527 ! FIXME: Be careful with the symmetry conventions here and the interplay between umklapp in q and FFT 7528 ik_ibz = gwr%kbz2ibz_symrel(1, ik_bz); isym_k = gwr%kbz2ibz_symrel(2, ik_bz) 7529 trev_k = gwr%kbz2ibz_symrel(6, ik_bz); g0_k = gwr%kbz2ibz_symrel(3:5, ik_bz) 7530 isirr_k = (isym_k == 1 .and. trev_k == 0 .and. all(g0_k == 0)) 7531 kk_ibz = gwr%kibz(:, ik_ibz) 7532 7533 ! Identify q and G0 where q + G0 = k_GW - ksum 7534 kgw_m_ksum = kgw - ksum 7535 call findqg0(iq_bz, g0, kgw_m_ksum, gwr%nqbz, gwr%qbz, gwr%mG0) 7536 !ABI_CHECK(all(g0 == 0), sjoin("g0 = ", ltoa(g0))) 7537 7538 call calc_ceigr(g0, u_nfft, nspinor, u_ngfft, eig0r) 7539 7540 ! If symmetries are exploited, only q-points in the IBZ_k are computed. 7541 ! In this case elements are weighted according to wtqp and wtqm. wtqm is for time-reversal. 7542 wtqp = 1; wtqm = 0 7543 !if (can_symmetrize(spin)) then 7544 if (gwr%dtset%symsigma == 1) then 7545 if (ltg_k%ibzq(iq_bz) /= 1) CYCLE 7546 wtqp = sum(ltg_k%wtksym(1,:,iq_bz)) 7547 wtqm = sum(ltg_k%wtksym(2,:,iq_bz)) 7548 end if 7549 7550 qq_bz = gwr%qbz(:, iq_bz) 7551 iq_ibz = gwr%qbz2ibz(1, iq_bz); isym_q = gwr%qbz2ibz(2, iq_bz) 7552 trev_q = gwr%qbz2ibz(6, iq_bz); g0_q = gwr%qbz2ibz(3:5, iq_bz) 7553 isirr_q = (isym_q == 1 .and. trev_q == 0 .and. all(g0_q == 0)) 7554 7555 ! Find the corresponding irreducible q-point. 7556 ! NB: non-zero umklapp G_o is not allowed. There's a check in setup_sigma 7557 !call qmesh%get_BZ_item(iq_bz, qbz, iq_ibz, isym_q, itim_q) 7558 q_is_gamma = normv(qq_bz, cryst%gmet, "G") < GW_TOLQ0 7559 call get_kg(qq_bz, istwfk1, dtset%ecutsigx, cryst%gmet, npwx, gvec_x) 7560 7561 ABI_MALLOC(gbound_x, (2*u_mgfft + 8, 2)) 7562 call sphereboundary(gbound_x, istwfk1, gvec_x, u_mgfft, npwx) 7563 7564 ! Tables for the FFT of the oscillators. 7565 ! a) FFT index of G-G0. 7566 ! b) x_gbound table for the zero-padded FFT performed in rhotwg. 7567 !ABI_MALLOC(x_gbound, (2*u_mgfft+8, 2)) 7568 !call Gsph_x%fft_tabs(g0, u_mgfft, u_ngfft, use_padfft, x_gbound, igfftxg0) 7569 7570 ABI_MALLOC(rhotwg_ki, (npwx * nspinor, bmin:bmax)) 7571 ABI_MALLOC(rhotwg, (npwx * nspinor)) 7572 ABI_MALLOC(rhotwgp, (npwx * nspinor)) 7573 ABI_MALLOC(vc_sqrt_qbz, (npwx)) 7574 spinor_padx = reshape([0, 0, npwx, npwx, 0, npwx, npwx, 0], [2, 4]) 7575 7576 ! Get Fourier components of the Coulomb interaction in the BZ 7577 ! In 3D systems, neglecting umklapp, vc(Sq,sG)=vc(q,G)=4pi/|q+G| 7578 ! The same relation holds for 0-D systems, but not in 1-D or 2D systems. It depends on S. 7579 call gwr%vcgen%get_vc_sqrt(qq_bz, npwx, gvec_x, gwr%q0, gwr%cryst, vc_sqrt_qbz, gwr%gtau_comm%value) 7580 7581 desc_ki => gwr%green_desc_kibz(ik_ibz) 7582 7583 ! Get npw_k and kg_k for this k. 7584 if (isirr_k) then 7585 istwf_k = desc_ki%istwfk; npw_k = desc_ki%npw 7586 ABI_MALLOC(kg_k, (3, npw_k)) 7587 kg_k(:,:) = desc_ki%gvec 7588 else 7589 istwf_k = 1 7590 call get_kg(ksum, istwf_k, dtset%ecut, cryst%gmet, npw_k, kg_k) 7591 end if 7592 7593 ABI_MALLOC(ug_ksum, (npw_k * nspinor)) 7594 ABI_MALLOC(cg1_ibz, (2, desc_ki%npw * nspinor)) 7595 !ABI_MALLOC(cg2_bz, (2, npw_k * nspinor)) 7596 7597 ABI_MALLOC(gbound_ksum, (2*u_mgfft+8, 2)) 7598 call sphereboundary(gbound_ksum, istwf_k, kg_k, u_mgfft, npw_k) 7599 7600 ! ========================== 7601 ! Sum over (occupied) bands 7602 ! ========================== 7603 ugb_kibz => gwr%ugb(ik_ibz, spin) 7604 7605 do il_b=1,ugb_kibz%sizeb_local(2) 7606 ! Distribute bands inside tau_comm as wavefunctions are replicated 7607 if (gwr%tau_comm%skip(il_b)) cycle 7608 band_sum = ugb_kibz%loc2gcol(il_b) 7609 7610 ! Skip empty states. MRM: allow negative occ numbers. 7611 if (abs(qp_occ(band_sum, ik_ibz, spin)) < tol_empty) CYCLE 7612 7613 !call wfd%get_ur(band_sum, ik_ibz, spin, ur_ibz) 7614 7615 ! Compute ur_ksum(r) from the symmetrical image. 7616 ! I should rotate the g-vectors outside the loop and rotate ug here 7617 ! but at present I cannot use cgtk_rotate due to the symrel^T convention. 7618 7619 if (isirr_k) then 7620 !call wfd%copy_cg(ibsum_kq, ikq_ibz, spin, bra_kq) 7621 ug_ksum(:) = ugb_kibz%buffer_cplx(:, il_b) 7622 else 7623 ! Reconstruct u_kq(G) from the IBZ image. 7624 !call wfd%copy_cg(ibsum_kq, ikq_ibz, spin, cgwork) 7625 7626 ! FIXME: This is wrong if spc 7627 call c_f_pointer(c_loc(ug_ksum), cg2_ptr, shape=[2, npw_k * nspinor]) 7628 7629 !call c_f_pointer(c_loc(ugb_kibz%buffer_cplx(:, il_b)), cg1_ptr, shape=[2, desc_ki%npw * nspinor]) 7630 !call cgtk_rotate(cryst, kk_ibz, isym_k, trev_k, g0_k, nspinor, ndat1, & 7631 ! desc_ki%npw, desc_ki%gvec, & 7632 ! npw_k, kg_k, desc_ki%istwfk, istwf_k, cg1_ptr, cg2_ptr, work_ngfft, work) 7633 7634 cg1_ibz(1,:) = real(ugb_kibz%buffer_cplx(:, il_b)) 7635 cg1_ibz(2,:) = aimag(ugb_kibz%buffer_cplx(:, il_b)) 7636 call cgtk_rotate(cryst, kk_ibz, isym_k, trev_k, g0_k, nspinor, ndat1, & 7637 desc_ki%npw, desc_ki%gvec, & 7638 npw_k, kg_k, desc_ki%istwfk, istwf_k, cg1_ibz, cg2_ptr, work_ngfft, work) 7639 end if 7640 7641 call fft_ug(npw_k, u_nfft, nspinor, ndat1, u_mgfft, u_ngfft, istwf_k, kg_k, gbound_ksum, & 7642 ug_ksum, ur_ksum) 7643 7644 if (any(g0 /= 0)) ur_ksum = ur_ksum * conjg(eig0r) 7645 7646 ! Get all <k-q,band_sum,s|e^{-i(q+G).r}|s,jb,k> 7647 do jb=bmin,bmax 7648 7649 ! FIXME: nspinor 2 is wrong as we have a 2x2 matrix 7650 ur_prod(:) = conjg(ur_ksum(:)) * ur_bdgw(:,jb) 7651 call fft_ur(npwx, u_nfft, nspinor, ndat1, u_mgfft, u_ngfft, istwfk1, gvec_x, gbound_x, & 7652 ur_prod, rhotwg_ki(:,jb)) 7653 7654 ! Multiply by the square root of the Coulomb term 7655 ! In 3-D systems, the factor sqrt(4pi) is included 7656 do ii=1,nspinor 7657 spad = (ii-1) * npwx 7658 rhotwg_ki(spad+1:spad+npwx,jb) = rhotwg_ki(spad+1:spad + npwx,jb) * vc_sqrt_qbz(1:npwx) 7659 end do 7660 7661 if (q_is_gamma) then 7662 !if (ik_bz == jk_bz) then 7663 ! Treat analytically the case q --> 0: 7664 ! 7665 ! * The oscillator is evaluated at q = 0 as it is considered constant in the small cube around Gamma 7666 ! while the Colulomb term is integrated out. 7667 ! * If nspinor == 1, we have nonzero contribution only if band_sum == jb 7668 ! * If nspinor == 2, we evaluate <band_sum,up|jb,up> and <band_sum,dwn|jb,dwn>, 7669 ! and impose orthonormalization since npwwfn might be < npwvec. 7670 ! * Note the use of i_sz_resid and not i_sz, to account for the possibility 7671 ! to have generalized KS basis set from hybrid 7672 7673 if (nspinor == 1) then 7674 rhotwg_ki(1, jb) = czero_gw 7675 if (band_sum == jb) rhotwg_ki(1,jb) = cmplx(sqrt(gwr%vcgen%i_sz), 0.0_gwp) 7676 !rhotwg_ki(1,jb) = czero_gw ! DEBUG 7677 7678 else 7679 !ABI_ERROR("Not implemented Error") 7680 rhotwg_ki(1, jb) = zero; rhotwg_ki(npwx+1, jb) = zero 7681 if (band_sum == jb) then 7682 !ABI_CHECK(wfd%get_wave_ptr(band_sum, ik_ibz, spin, wave_sum, msg) == 0, msg) 7683 !cg_sum => wave_sum%ug 7684 !ABI_CHECK(wfd%get_wave_ptr(jb, jk_ibz, spin, wave_jb, msg) == 0, msg) 7685 !cg_jb => wave_jb%ug 7686 !ctmp = xdotc(npw_k, cg_sum(1:), 1, cg_jb(1:), 1) 7687 rhotwg_ki(1, jb) = cmplx(sqrt(gwr%vcgen%i_sz), 0.0_gwp) !* real(ctmp) 7688 !ctmp = xdotc(npw_k, cg_sum(npw_k+1:), 1, cg_jb(npw_k+1:), 1) 7689 rhotwg_ki(npwx+1, jb) = cmplx(sqrt(gwr%vcgen%i_sz), 0.0_gwp) ! * real(ctmp) 7690 end if 7691 !!!rhotwg_ki(1, jb) = zero; rhotwg_ki(npwx+1, jb) = zero 7692 !!! PAW is missing 7693 end if 7694 end if 7695 7696 end do ! jb Got all matrix elements from bmin up to bmax. 7697 7698 theta_mu_minus_esum = fact_spin * qp_occ(band_sum, ik_ibz, spin) 7699 theta_mu_minus_esum2 = sqrt(abs(fact_spin * qp_occ(band_sum, ik_ibz, spin))) ! MBB Nat. orb. funct. approx. sqrt(occ) 7700 7701 if (abs(theta_mu_minus_esum / fact_spin) >= tol_empty) then ! MRM: allow negative occ numbers 7702 do kb=bmin,bmax 7703 7704 ! Copy the ket Sigma_x |phi_{k,kb}>. 7705 rhotwgp(:) = rhotwg_ki(:, kb) 7706 7707 ! Loop over the non-zero row elements of this column. 7708 ! If gwcalctyp < 20: only diagonal elements since QP == KS. 7709 ! If gwcalctyp >= 20: 7710 ! * Only off-diagonal elements connecting states with same character. 7711 ! * Only the upper triangle if HF, SEX, or COHSEX. 7712 7713 do irow=1,Sigxij_tab(ikcalc, spin)%col(kb)%size1 7714 jb = Sigxij_tab(ikcalc, spin)%col(kb)%bidx(irow) 7715 rhotwg(:) = rhotwg_ki(:,jb) 7716 7717 ! Calculate bare exchange <phi_jb|Sigma_x|phi_kb>. 7718 ! Do the scalar product only if band_sum is occupied. 7719 do iab=1,gwr%nsig_ab 7720 spadx1 = spinor_padx(1, iab); spadx2 = spinor_padx(2, iab) 7721 xdot_tmp = -XDOTC(npwx, rhotwg(spadx1+1:), 1, rhotwgp(spadx2+1:), 1) 7722 gwpc_sigxme = xdot_tmp * theta_mu_minus_esum 7723 gwpc_sigxme2 = xdot_tmp * theta_mu_minus_esum2 7724 7725 ! Accumulate and symmetrize Sigma_x matrix elements. 7726 ! -wtqm comes from time-reversal (exchange of band indices) 7727 is_idx = spin; if (nspinor == 2) is_idx = iab 7728 sigxme_tmp(jb, kb, is_idx) = sigxme_tmp(jb, kb, is_idx) + & 7729 (wtqp + wtqm) * DBLE(gwpc_sigxme) + (wtqp - wtqm) * j_dpc * AIMAG(gwpc_sigxme) 7730 if (jb == kb) then 7731 sigxcme_tmp(jb, is_idx) = sigxcme_tmp(jb, is_idx) + & 7732 (wtqp + wtqm) * DBLE(gwpc_sigxme2) + (wtqp - wtqm) *j_dpc * AIMAG(gwpc_sigxme2) 7733 end if 7734 7735 sigx(1, jb, kb, is_idx) = sigx(1, jb, kb, is_idx) + wtqp * gwpc_sigxme 7736 sigx(2, jb, kb, is_idx) = sigx(2, jb, kb, is_idx) + wtqm *CONJG(gwpc_sigxme) 7737 end do 7738 end do ! irow 7739 7740 end do ! kb 7741 end if 7742 end do ! band_sum 7743 7744 ABI_FREE(gbound_x) 7745 ABI_FREE(kg_k) 7746 ABI_FREE(ug_ksum) 7747 ABI_FREE(cg1_ibz) 7748 !ABI_FREE(cg2_bz) 7749 ABI_FREE(gbound_ksum) 7750 ABI_FREE(gvec_x) 7751 ABI_FREE(rhotwg_ki) 7752 ABI_FREE(rhotwg) 7753 ABI_FREE(rhotwgp) 7754 ABI_FREE(vc_sqrt_qbz) 7755 end do ! my_ikf Got all diagonal (off-diagonal) matrix elements. 7756 7757 ! Gather contributions from all the CPUs. 7758 call xmpi_sum(sigxme_tmp, gwr%kgt_comm%value, ierr) 7759 call xmpi_sum(sigxcme_tmp, gwr%kgt_comm%value, ierr) 7760 call xmpi_sum(sigx, gwr%kgt_comm%value, ierr) 7761 7762 ! Multiply by constants. For 3D systems sqrt(4pi) is included in vc_sqrt_qbz. 7763 sigxme_tmp = (one / (cryst%ucvol * gwr%nkbz)) * sigxme_tmp ! * Sigp%sigma_mixing 7764 sigxcme_tmp = (one / (cryst%ucvol * gwr%nkbz)) * sigxcme_tmp ! * Sigp%sigma_mixing 7765 sigx = (one / (cryst%ucvol * gwr%nkbz)) * sigx ! * Sigp%sigma_mixing 7766 7767 ! If we have summed over the IBZ_q, we have to average over degenerate states. 7768 ! Presently only diagonal terms are considered 7769 ! Note that here we pass ks_eig to sigx_symmetrize instead of qp_eig. 7770 ! The reason is that we use the eigenvalues to detect degeneracies before averaging 7771 ! and qp_eig may break degeneracies while ks_eig are much more accurate. 7772 ! Most of the breaking comes from the correlated part, likey due to the treatment of q --> 0. 7773 7774 ! TODO QP-SCGW required a more involved approach, there is a check in sigma 7775 ! TODO it does not work if nspinor == 2. 7776 7777 if (gwr%dtset%symsigma == 1) then 7778 call sigx_symmetrize(ikcalc_ibz, spin, bmin, bmax, nsppol, nspinor, gwr%nsig_ab, ks_eig, sigx, sigxme_tmp) 7779 !do ii=bmin, bmax; print *, "qp_eig:", ii, qp_eig(ii, ikcalc_ibz, spin) * Ha_eV; end do 7780 !call sigx_symmetrize(ikcalc_ibz, spin, bmin, bmax, nsppol, nspinor, gwr%nsig_ab, qp_eig, sigx, sigxme_tmp) 7781 end if 7782 7783 ! Reconstruct the full sigma_x matrix from the upper triangle. 7784 if (gwr%nsig_ab == 1) then 7785 call hermitianize(sigxme_tmp(:,:,spin), "Upper") 7786 else 7787 ABI_WARNING("Should hermitianize non-collinear sigma!") 7788 end if 7789 7790 ! Save exchange matrix in gwr%sigx_mat taking into account sig_diago. 7791 if (gwr%nsig_ab == 1) then 7792 if (gwr%sig_diago) then 7793 do jb=bmin,bmax 7794 gwr%sigx_mat(jb, 1, ikcalc, spin) = sigxme_tmp(jb,jb,spin) 7795 end do 7796 else 7797 gwr%sigx_mat(bmin:bmax, bmin:bmax, ikcalc, spin) = sigxme_tmp(bmin:bmax, bmin:bmax, spin) 7798 end if 7799 else 7800 if (gwr%sig_diago) then 7801 do iab=1,gwr%nsig_ab 7802 do jb=bmin,bmax 7803 gwr%sigx_mat(jb, 1, ikcalc, iab) = sigxme_tmp(jb,jb,iab) 7804 end do 7805 end do 7806 else 7807 gwr%sigx_mat(bmin:bmax, bmin:bmax, ikcalc, :) = sigxme_tmp(bmin:bmax, bmin:bmax, :) 7808 end if 7809 end if 7810 7811 ABI_FREE(ur_bdgw) 7812 ABI_FREE(ur_prod) 7813 ABI_FREE(ur_ksum) 7814 ABI_FREE(eig0r) 7815 ABI_FREE(sigxme_tmp) 7816 ABI_FREE(sigxcme_tmp) 7817 ABI_FREE(sigx) 7818 call ltg_k%free() 7819 call cwtime_report(" Sigx_nk:", cpu_k, wall_k, gflops_k) 7820 end do ! ikcalc 7821 end do ! my_is 7822 7823 if (gwr%spin_comm%nproc > 1) call xmpi_sum(gwr%sigx_mat, gwr%spin_comm%value, ierr) 7824 7825 ABI_FREE(work) 7826 call sigijtab_free(Sigxij_tab) 7827 ABI_FREE(Sigxij_tab) 7828 7829 ! Compute QP results. Done usually when gwr_task == G0v i.e. Hartree-Fock with KS states. 7830 compute_qp__ = .False.; if (present(compute_qp)) compute_qp__ = compute_qp 7831 if (compute_qp__ .and. gwr%comm%me == 0) then 7832 call write_notations(units) 7833 ! TODO 7834 end if 7835 7836 call cwtime_report(" gwr_build_sigxme:", cpu_all, wall_all, gflops_all) 7837 call timab(1920, 2, tsec) 7838 7839 end subroutine gwr_build_sigxme
m_gwr/gwr_build_tchi [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_build_tchi
FUNCTION
High-level routine to compute the irreducible polarizability.
SOURCE
4043 subroutine gwr_build_tchi(gwr) 4044 4045 !Arguments ------------------------------------ 4046 class(gwr_t),target,intent(inout) :: gwr 4047 4048 !Local variables------------------------------- 4049 !scalars 4050 integer :: my_is, my_it, my_ikf, ig, my_ir, my_nr, nrsp, npwsp, ncol_glob, col_bsize, my_iqi, gt_scbox_win 4051 integer :: idat, ndat, max_ndat, sc_nfft, sc_nfftsp, spin, ik_bz, iq_ibz, ikq_ibz, ikq_bz, ierr, ipm, itau, ig2 !, ii 4052 integer :: use_umklp ! ik_ibz, isym_k, trev_k, tsign_k, ! g0_k(3), 4053 integer(kind=XMPI_ADDRESS_KIND) :: buf_count 4054 real(dp) :: cpu_tau, wall_tau, gflops_tau, cpu_all, wall_all, gflops_all, cpu_ir, wall_ir, gflops_ir 4055 real(dp) :: cpu_ikf, wall_ikf, gflops_ikf 4056 real(dp) :: tchi_rfact, mem_mb, local_max, max_abs_imag_chit, wtqp, wtqm 4057 complex(gwpc) :: head_q 4058 complex(dp) :: chq(3), wng(3) 4059 logical :: q_is_gamma, use_shmem_for_k, use_mpi_for_k, print_time ! isirr_k, 4060 character(len=5000) :: msg 4061 type(desc_t),pointer :: desc_q ! desc_k, 4062 type(__slkmat_t) :: chi_rgp 4063 type(c_ptr) :: void_ptr 4064 !arrays 4065 integer :: sc_ngfft(18), gg(3), g0_kq(3), mask_qibz(gwr%nqibz), need_kibz(gwr%nkibz), got_kibz(gwr%nkibz) 4066 integer,allocatable :: green_scgvec(:,:), chi_scgvec(:,:) 4067 real(dp) :: kk_bz(3), kpq_bz(3), qq_ibz(3), tsec(2) 4068 complex(gwpc) ABI_ASYNC, contiguous, pointer :: gt_scbox(:,:,:) 4069 complex(gwpc),allocatable :: low_wing_q(:), up_wing_q(:), cemiqr(:) 4070 type(__slkmat_t) :: gkq_rpr_pm(2), gk_rpr_pm(2) 4071 type(__slkmat_t),allocatable :: gt_gpr(:,:), chiq_gpr(:), chiq_rpr(:) 4072 type(desc_t),target,allocatable :: desc_mykbz(:) 4073 type(littlegroup_t),allocatable :: ltg_qibz(:) 4074 type(fftbox_plan3_t) :: green_plan 4075 type(uplan_t) :: uplan_q 4076 4077 ! ************************************************************************* 4078 4079 call cwtime(cpu_all, wall_all, gflops_all, "start") 4080 call timab(1923, 1, tsec) 4081 4082 ABI_CHECK(gwr%tchi_space == "none", sjoin("tchi_space: ", gwr%tchi_space, " != none")) 4083 gwr%tchi_space = "itau" 4084 4085 ! Allocate tchi_q(g,g') matrices 4086 mask_qibz = 0; mask_qibz(gwr%my_qibz_inds(:)) = 1 4087 call gwr%print_mem(unit=std_out) 4088 call gwr%malloc_free_mats(mask_qibz, "tchi", "malloc") 4089 4090 max_abs_imag_chit = zero 4091 4092 ! Setup FFT mesh in the supercell. 4093 sc_ngfft = gwr%g_ngfft 4094 sc_ngfft(1:3) = gwr%ngkpt * gwr%g_ngfft(1:3); sc_ngfft(4:6) = sc_ngfft(1:3) 4095 sc_nfft = product(sc_ngfft(1:3)); sc_nfftsp = sc_nfft * gwr%nspinor 4096 4097 if (gwr%use_supercell_for_tchi) then 4098 ! ============================ 4099 ! Chi algorithm with supercell 4100 ! ============================ 4101 call print_chi_header() 4102 4103 ! Be careful when using the FFT plan with ndat as ndat can change inside the loop if we start to block. 4104 ! Perhaps the safest approach would be to generate the plan on the fly. 4105 max_ndat = gwr%sc_batch_size 4106 use_mpi_for_k = gwr%sc_batch_size == gwr%kpt_comm%nproc .and. gwr%kpt_comm%nproc > 1 4107 use_mpi_for_k = .False. 4108 4109 use_shmem_for_k = gwr%sc_batch_size == gwr%kpt_comm%nproc .and. gwr%kpt_comm%nproc > 1 4110 use_shmem_for_k = use_shmem_for_k .and. gwr%kpt_comm%can_use_shmem() 4111 !use_shmem_for_k = .False. 4112 4113 if (use_shmem_for_k) then 4114 buf_count = 2 * (sc_nfftsp * max_ndat * 2) 4115 call gwr%kpt_comm%allocate_shared_master(buf_count, gwpc, xmpi_info_null, void_ptr, gt_scbox_win) 4116 call c_f_pointer(void_ptr, gt_scbox, shape=[sc_nfftsp, max_ndat, 2]) 4117 end if 4118 4119 call wrtout(std_out, sjoin(" use_mpi_for_k:", yesno(use_mpi_for_k))) 4120 call wrtout(std_out, sjoin(" use_shmem_for_k:", yesno(use_shmem_for_k))) 4121 mem_mb = (sc_nfftsp * max_ndat * 2 * gwpc) * b2Mb 4122 call wrtout(std_out, sjoin(" Memory for gt_scbox array:", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 4123 4124 if (.not. use_shmem_for_k) then 4125 ABI_MALLOC(gt_scbox, (sc_nfftsp, max_ndat, 2)) 4126 end if 4127 4128 ! Build plan for dense FFTs. 4129 call green_plan%from_ngfft(sc_ngfft, gwr%nspinor*max_ndat*2, gwr%dtset%gpu_option) 4130 4131 ! The g-vectors in the supercell for G and tchi. 4132 ABI_MALLOC(green_scgvec, (3, gwr%green_mpw)) 4133 ABI_MALLOC(chi_scgvec, (3, gwr%tchi_mpw)) 4134 ABI_MALLOC(cemiqr, (gwr%g_nfft * gwr%nspinor)) ! The phase e^{-iq.r} in the unit cell. 4135 ABI_MALLOC(gt_gpr, (2, gwr%my_nkbz)) 4136 ABI_MALLOC(chiq_gpr, (gwr%my_nqibz)) 4137 ABI_MALLOC(desc_mykbz, (gwr%my_nkbz)) 4138 4139 ! Allocate PBLAS arrays for tchi_q(g',r) for all q in the IBZ treated by this MPI rank. 4140 ! Here we're gonna have a big allocation peak. 4141 do my_iqi=1,gwr%my_nqibz 4142 iq_ibz = gwr%my_qibz_inds(my_iqi) 4143 npwsp = gwr%tchi_desc_qibz(iq_ibz)%npw * gwr%nspinor 4144 ncol_glob = gwr%g_nfft * gwr%nspinor 4145 ABI_CHECK(block_dist_1d(ncol_glob, gwr%g_comm%nproc, col_bsize, msg), msg) 4146 call chiq_gpr(my_iqi)%init(npwsp, gwr%g_nfft * gwr%nspinor, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 4147 end do 4148 mem_mb = sum(slk_array_locmem_mb(chiq_gpr)) 4149 call wrtout(std_out, sjoin(" Local memory for chi_q(g',r) matrices: ", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 4150 4151 ! Loop over my spins and my taus. 4152 do my_is=1,gwr%my_nspins 4153 spin = gwr%my_spins(my_is) 4154 do my_it=1,gwr%my_ntau 4155 call cwtime(cpu_tau, wall_tau, gflops_tau, "start") 4156 itau = gwr%my_itaus(my_it) 4157 !if (my_it == 1 .and. gwr%comm%me == 0) call gwr%pstat%print([std_out], reload=.True.) 4158 4159 ! G_k(g,g') --> G_k(g',r) e^{ik.r} for each k in the BZ treated by me. 4160 call gwr%get_myk_green_gpr(itau, spin, desc_mykbz, gt_gpr) 4161 4162 !if (my_it == 1 .and. gwr%comm%me == 0) call gwr%pstat%print([std_out], reload=.True.) 4163 4164 ! Loop over r in the unit cell that is now MPI-distributed inside g_comm. 4165 ! This is a bottleneck but perhaps one can take advantage of localization. 4166 ! Also, one can save all the FFTs in a matrix G(mnfft * ndat, my_nkbz) multiply by the e^{-ikr} phase 4167 ! and then use zgemm to compute Out(r,L) = [e^{-ikr}G_k(r)] e^{-ikL} with precomputed e^{-iLk} phases. 4168 my_nr = gt_gpr(1,1)%sizeb_local(2) 4169 4170 do my_ir=1, my_nr, gwr%sc_batch_size 4171 ndat = blocked_loop(my_ir, my_nr, gwr%sc_batch_size) 4172 print_time = (gwr%comm%me == 0 .and. (my_ir <= 6 * gwr%sc_batch_size .or. mod(my_ir, LOG_MODR) == 0)) 4173 if (print_time) call cwtime(cpu_ir, wall_ir, gflops_ir, "start") 4174 4175 ! TODO: GPU version 4176 4177 if (.not. use_shmem_for_k) then 4178 4179 ! Insert G_k(g',r) in G'-space in the supercell FFT box (ndat vectors starting at my_ir). 4180 call gwr%gk_to_scbox(sc_ngfft, desc_mykbz, green_scgvec, my_ir, ndat, gt_gpr, gt_scbox) 4181 4182 if (.not. use_mpi_for_k) then 4183 ! G(G',r) --> G(R',r) = sum_{k,g'} e^{-i(k+g').R'} G_k(g',r) 4184 if (gwr%kpt_comm%nproc > 1) call xmpi_sum(gt_scbox, gwr%kpt_comm%value, ierr) 4185 call green_plan%execute(gt_scbox(:,1,1), -1, iscale=0) 4186 4187 ! Compute tchi(R',r) for this r and store it in (:,:,1). Note that results are real so one might use r2c FFT. 4188 ! Then back to tchi(G'=q+g',r) immediately with isign + 1. 4189 gt_scbox(:,:,1) = gt_scbox(:,:,1) * conjg(gt_scbox(:,:,2)) 4190 !max_abs_imag_chit = max(max_abs_imag_chit, maxval(abs(aimag(gt_scbox(:,:,1))))) 4191 call green_plan%execute(gt_scbox(:,1,1), +1) 4192 4193 else 4194 ! Reduce one G_k(tau) on the idat-1 proc and perform ndat FFTs in parallel. 4195 ! Finally, broadcast from the (idat-1) proc inside gwr%kpt_comm. 4196 do ipm=1,2 4197 do idat=1,ndat 4198 call xmpi_sum_master(gt_scbox(:,idat,ipm), idat-1, gwr%kpt_comm%value, ierr) 4199 end do 4200 end do 4201 idat = gwr%kpt_comm%me + 1 4202 do ipm=1,2 4203 call green_plan%execute(gt_scbox(:,idat,ipm), -1, ndat=gwr%nspinor, iscale=0) 4204 end do 4205 gt_scbox(:,idat,1) = gt_scbox(:,idat,1) * conjg(gt_scbox(:,idat,2)) 4206 call green_plan%execute(gt_scbox(:,idat,1), +1, ndat=gwr%nspinor) 4207 do idat=1,ndat 4208 call xmpi_bcast(gt_scbox(:,idat,1), idat-1, gwr%kpt_comm%value, ierr) 4209 end do 4210 end if 4211 4212 else 4213 ! use_shmem_for_k --> MPI shared window version. Only gt_scbox is shared. 4214 call gwr%gk_to_scbox(sc_ngfft, desc_mykbz, green_scgvec, my_ir, ndat, gt_gpr, gt_scbox, & 4215 gt_scbox_win=gt_scbox_win) 4216 4217 ! Now each MPI proc operates on different idat entries. 4218 !IF (.not. MPI_ASYNC_PROTECTS_NONBLOCKING) CALL MPI_F_SYNC_REG(gt_scbox) 4219 call xmpi_win_fence(gt_scbox_win) 4220 idat = gwr%kpt_comm%me + 1 4221 if (idat <= ndat) then 4222 do ipm=1,2 4223 call green_plan%execute(gt_scbox(:,idat,ipm), -1, ndat=gwr%nspinor, iscale=0) 4224 end do 4225 gt_scbox(:,idat,1) = gt_scbox(:,idat,1) * conjg(gt_scbox(:,idat,2)) 4226 call green_plan%execute(gt_scbox(:,idat,1), +1, ndat=gwr%nspinor) 4227 end if 4228 !IF (.not. MPI_ASYNC_PROTECTS_NONBLOCKING) CALL MPI_F_SYNC_REG(gt_scbox) 4229 !call xmpi_barrier(gwr%kpt_comm%value) 4230 call xmpi_win_fence(gt_scbox_win) 4231 end if 4232 4233 ! Now extract tchi_q(g',r) on the ecuteps (q+g)-sphere from the FFT box in the supercell 4234 ! and save data in chiq_gpr PBLAS matrix. Only my q-points in the IBZ are considered. 4235 ! Alternatively, one can avoid the above FFT, use zero-padded to go from the supercell 4236 ! to the ecuteps g-sphere inside the my_iqi loop. This approach should play well with k-point parallelism. 4237 do my_iqi=1,gwr%my_nqibz 4238 iq_ibz = gwr%my_qibz_inds(my_iqi); qq_ibz = gwr%qibz(:, iq_ibz); desc_q => gwr%tchi_desc_qibz(iq_ibz) 4239 gg = nint(qq_ibz * gwr%ngqpt) 4240 do ig=1,desc_q%npw 4241 chi_scgvec(:,ig) = gg + gwr%ngqpt(:) * desc_q%gvec(:,ig) ! q+g 4242 end do 4243 call box2gsph(sc_ngfft, desc_q%npw, gwr%nspinor * ndat, chi_scgvec, & 4244 gt_scbox(:,1,1), chiq_gpr(my_iqi)%buffer_cplx(:,my_ir)) 4245 ! TODO: 4246 !call desc_q%box2gsph(qq_ibz, gwr%ngqpt, sc_ngfft, gwr%nspinor * ndat, & 4247 ! gt_scbox(:,1,1), chiq_gpr(my_iqi)%buffer_cplx(:,my_ir)) 4248 end do ! my_iqi 4249 4250 if (print_time) then 4251 write(msg,'(4x,3(a,i0),a)')"Chi my_ir [", my_ir, "/", my_nr, "] (tot: ", gwr%g_nfft, ")" 4252 call cwtime_report(msg, cpu_ir, wall_ir, gflops_ir) 4253 end if 4254 end do ! my_ir (end cpu intensive loop) 4255 4256 ! Free descriptors and PBLAS matrices in kBZ. 4257 call desc_array_free(desc_mykbz) 4258 call slk_array_free(gt_gpr) 4259 4260 ! Now we have tchi_q(g',r). 4261 ! For each IBZ q-point treated by this MPI proc, do: 4262 ! 4263 ! 1) MPI transpose to have tchi_q(r,g') 4264 ! 2) FFT along the first dimension to get tchi_q(g,g') and store it in gwr%tchi_qibz 4265 ! 4266 tchi_rfact = one / gwr%g_nfft / gwr%cryst%ucvol / (gwr%nkbz * gwr%nqbz) 4267 do my_iqi=1,gwr%my_nqibz 4268 iq_ibz = gwr%my_qibz_inds(my_iqi) 4269 q_is_gamma = normv(gwr%qibz(:,iq_ibz), gwr%cryst%gmet, "G") < GW_TOLQ0 4270 desc_q => gwr%tchi_desc_qibz(iq_ibz) 4271 4272 ! Note the minus sign in q. 4273 if (.not. q_is_gamma) call calc_ceikr(-gwr%qibz(:,iq_ibz), gwr%g_ngfft, gwr%g_nfft, gwr%nspinor, cemiqr) 4274 4275 ! MPI-transposition: tchi_q(g',r) => tchi_q(r,g') 4276 call chiq_gpr(my_iqi)%ptrans("N", chi_rgp) 4277 4278 ! FFT tchi_q(r,g') --> tchi_q(g,g'). Results stored in gwr%tchi_qibz. 4279 call uplan_q%init(desc_q%npw, gwr%nspinor, gwr%uc_batch_size, gwr%g_ngfft, istwfk1, & 4280 desc_q%gvec, gwpc, gwr%dtset%gpu_option) 4281 4282 do ig2=1, chi_rgp%sizeb_local(2), gwr%uc_batch_size 4283 ndat = blocked_loop(ig2, chi_rgp%sizeb_local(2), gwr%uc_batch_size) 4284 4285 if (.not. q_is_gamma) then 4286 !$OMP PARALLEL DO 4287 do idat=0,ndat-1 4288 chi_rgp%buffer_cplx(:,ig2+idat) = cemiqr(:) * chi_rgp%buffer_cplx(:,ig2+idat) 4289 end do 4290 end if 4291 4292 call uplan_q%execute_rg(ndat, chi_rgp%buffer_cplx(:, ig2), & 4293 gwr%tchi_qibz(iq_ibz, itau, spin)%buffer_cplx(:, ig2)) 4294 4295 !$OMP PARALLEL DO 4296 do idat=0,ndat-1 4297 gwr%tchi_qibz(iq_ibz, itau, spin)%buffer_cplx(:, ig2 + idat) = & 4298 gwr%tchi_qibz(iq_ibz, itau, spin)%buffer_cplx(:, ig2 + idat) * tchi_rfact 4299 end do 4300 !call gwr%tchi_qibz(iq_ibz, itau, spin)%scale_rows(ig2, ndat, tchi_rfact) 4301 !call xscal(npwsp, real(sqrt(gt_rfact), kind=gwpc), work_gb%buffer_cplx(:,il_b), 1) 4302 end do ! ig2 4303 4304 call uplan_q%free() 4305 call chi_rgp%free() 4306 4307 call gwr%tchi_qibz(iq_ibz, itau, spin)%set_imag_diago_to_zero(local_max) 4308 end do ! my_iqi 4309 4310 write(msg,'(3(a,i0),a)')" My itau [", my_it, "/", gwr%my_ntau, "] (tot: ", gwr%ntau, ")" 4311 call cwtime_report(msg, cpu_tau, wall_tau, gflops_tau, end_str=ch10) 4312 end do ! my_it 4313 end do ! my_is 4314 4315 if (use_shmem_for_k) then 4316 call xmpi_win_free(gt_scbox_win) 4317 else 4318 ABI_FREE(gt_scbox) 4319 end if 4320 4321 ABI_FREE(green_scgvec) 4322 ABI_FREE(chi_scgvec) 4323 ABI_FREE(cemiqr) 4324 ABI_FREE(gt_gpr) 4325 ABI_FREE(desc_mykbz) 4326 call slk_array_free(chiq_gpr) 4327 ABI_FREE(chiq_gpr) 4328 call green_plan%free() 4329 4330 else ! not gwr%use_supercell_for_tchi 4331 ! =================================================================== 4332 ! Mixed-space algorithm in the unit cell with convolutions in k-space 4333 ! =================================================================== 4334 call print_chi_header() 4335 4336 ! Allocate memory for G_k(r',r) and chi_q(r',r) 4337 ! Need all nqibz matrices here as the iq_ibz loop is the innermost one unlike in the legacy GW code. 4338 nrsp = gwr%g_nfft * gwr%nspinor 4339 col_bsize = nrsp / gwr%g_comm%nproc; if (mod(nrsp, gwr%g_comm%nproc) /= 0) col_bsize = col_bsize + 1 4340 ABI_MALLOC(chiq_rpr, (gwr%nqibz)) 4341 do iq_ibz=1,gwr%nqibz 4342 call chiq_rpr(iq_ibz)%init(nrsp, nrsp, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 4343 end do 4344 do ipm=1,2 4345 ! TODO: Can save memory here as we don't need +/- tau for each k/k+q 4346 call gk_rpr_pm(ipm)%init(nrsp, nrsp, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 4347 call gkq_rpr_pm(ipm)%init(nrsp, nrsp, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 4348 end do 4349 4350 mem_mb = sum(slk_array_locmem_mb(chiq_rpr)) + sum(slk_array_locmem_mb(gk_rpr_pm)) + sum(slk_array_locmem_mb(gkq_rpr_pm)) 4351 call wrtout(std_out, sjoin(" Local memory for chi_q(r',r) (gt_gpr): ", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 4352 4353 ! * The little group is needed when symchi == 1 4354 ! * If use_umklp == 1 then symmetries requiring an umklapp to preserve qibz are included as well. 4355 ! * Note that TR is not yet supported so timrev is set to 1 even if TR has been used to generate the GS IBZ. 4356 ABI_MALLOC(ltg_qibz, (gwr%nqibz)) 4357 use_umklp = 1 4358 do iq_ibz=1,gwr%nqibz 4359 call ltg_qibz(iq_ibz)%init(gwr%qibz(:,iq_ibz), gwr%nkbz, gwr%kbz, gwr%cryst, use_umklp, npwe=0, timrev=1) 4360 !call ltg_qibz(iq_ibz)%print(unit=std_out, prtvol=gwr%dtset%prtvol) 4361 end do 4362 4363 need_kibz = 0 4364 do my_ikf=1,gwr%my_nkbz 4365 ik_bz = gwr%my_kbz_inds(my_ikf); kk_bz = gwr%kbz(:, ik_bz) 4366 do iq_ibz=1,gwr%nqibz 4367 qq_ibz = gwr%qibz(:, iq_ibz) 4368 kpq_bz = kk_bz + qq_ibz 4369 !kpq_bz = qq_ibz - kk_bz 4370 !kpq_bz = kk_bz - qq_ibz 4371 ! TODO: here I may need to take into account the umklapp 4372 call findqg0(ikq_bz, g0_kq, kpq_bz, gwr%nkbz, gwr%kbz, gwr%mG0) 4373 !ABI_CHECK(all(g0_kq == 0), sjoin("g0_kq != 0, kk_bz", ktoa(kpq_bz), "qq_ibz:", ktoa(qq_ibz))) 4374 ikq_ibz = gwr%kbz2ibz(1,ikq_bz) 4375 need_kibz(ikq_ibz) = 1 4376 end do 4377 end do 4378 4379 do my_is=1,gwr%my_nspins 4380 spin = gwr%my_spins(my_is) 4381 do my_it=1,gwr%my_ntau 4382 call cwtime(cpu_tau, wall_tau, gflops_tau, "start") 4383 itau = gwr%my_itaus(my_it) 4384 4385 ! Redistribute G_k(g,g') with k in the IBZ so that each MPI proc 4386 ! can reconstruct G_{k+q} in the BZ inside the MPI-distributed loops. 4387 ! TODO: support for ipm_list else we have a memory leak. 4388 call gwr%redistrib_gt_kibz(itau, spin, need_kibz, got_kibz, "communicate") !ipm_list= 4389 4390 ! Sum over my k-points in the BZ. 4391 call slk_array_set(chiq_rpr, czero) 4392 4393 do my_ikf=1,gwr%my_nkbz 4394 print_time = gwr%comm%me == 0 .and. (my_ikf <= LOG_MODK .or. mod(my_ikf, LOG_MODK) == 0) 4395 if (print_time) call cwtime(cpu_ikf, wall_ikf, gflops_ikf, "start") 4396 ik_bz = gwr%my_kbz_inds(my_ikf); kk_bz = gwr%kbz(:, ik_bz) 4397 4398 !ik_ibz = gwr%kbz2ibz(1, ik_bz); isym_k = gwr%kbz2ibz(2, ik_bz) 4399 !trev_k = gwr%kbz2ibz(6, ik_bz); g0_k = gwr%kbz2ibz(3:5, ik_bz) 4400 !isirr_k = (isym_k == 1 .and. trev_k == 0 .and. all(g0_k == 0)) 4401 !tsign_k = merge(1, -1, trev_k == 0) 4402 !if (.not. isirr_k) cycle 4403 4404 ! Use symmetries to get G_kbz(g,g') from the IBZ, then G_kbz(g,g') -> G_kbz(r',r). 4405 ! TODO: here I may need to take into account the umklapp 4406 call gwr%get_gkbz_rpr_pm(ik_bz, itau, spin, gk_rpr_pm, ipm_list=[1]) ! g0=?? 4407 4408 do iq_ibz=1,gwr%nqibz 4409 if (gwr%dtset%symchi /= 0 .and. ltg_qibz(iq_ibz)%ibzq(ik_bz) == 0) cycle ! FIXME: iq_bz or ikq? 4410 qq_ibz = gwr%qibz(:,iq_ibz) 4411 kpq_bz = kk_bz + qq_ibz 4412 !kpq_bz = qq_ibz - kk_bz 4413 4414 ! TODO: here I may need to take into account the umklapp if k+q is outside the BZ. 4415 call findqg0(ikq_bz, g0_kq, kpq_bz, gwr%nkbz, gwr%kbz, gwr%mG0) 4416 !ABI_CHECK(all(g0_kq == 0), sjoin("g0_kq != 0, kk_bz", ktoa(kpq_bz), "qq_ibz:", ktoa(qq_ibz))) 4417 4418 ! Use symmetries to get G_kqbz(g,g') from the IBZ, then G_kqbz(g,g') -> G_kqbz(r',r). 4419 ! Also, we don't need G(+/-t) for both k, k+q wavevectors. 4420 call gwr%get_gkbz_rpr_pm(ikq_bz, itau, spin, gkq_rpr_pm, g0=g0_kq, ipm_list=[2]) 4421 4422 ! The weight depends on q_ibz and the symmetries of the little group of qq_ibz. 4423 wtqp = one / gwr%nkbz; wtqm = zero 4424 if (gwr%dtset%symchi /= 0) then 4425 wtqp = (one * sum(ltg_qibz(iq_ibz)%wtksym(1,:,ik_bz))) / gwr%nkbz 4426 wtqm = (one * sum(ltg_qibz(iq_ibz)%wtksym(2,:,ik_bz))) / gwr%nkbz 4427 ABI_CHECK(wtqm == zero, sjoin("TR is not yet implemented:, wqtm:", ftoa(wtqm))) 4428 end if 4429 4430 chiq_rpr(iq_ibz)%buffer_cplx = chiq_rpr(iq_ibz)%buffer_cplx + & 4431 !wtqp * gkq_rpr_pm(1)%buffer_cplx * conjg(gk_rpr_pm(2)%buffer_cplx) ! This should be OK 4432 wtqp * gk_rpr_pm(1)%buffer_cplx * conjg(gkq_rpr_pm(2)%buffer_cplx) ! RECHECK EQ. This one works 4433 ! but requires ptrans with C 4434 end do ! my_iqi 4435 4436 if (print_time) then 4437 write(msg,'(4x,3(a,i0),a)')"Chi my_ikf [", my_ikf, "/", gwr%my_nkbz, "] (tot: ", gwr%nkbz, ")" 4438 call cwtime_report(msg, cpu_ikf, wall_ikf, gflops_ikf) 4439 end if 4440 end do ! my_ikf 4441 4442 ! Deallocate got_kibz Green's functions. 4443 call gwr%redistrib_gt_kibz(itau, spin, need_kibz, got_kibz, "free") 4444 4445 ! From chi_q(r',r) to chi_q(g,g') for each q in the IBZ. 4446 do iq_ibz=1,gwr%nqibz 4447 call xmpi_sum(chiq_rpr(iq_ibz)%buffer_cplx, gwr%kpt_comm%value, ierr) 4448 end do 4449 4450 do iq_ibz=1,gwr%nqibz 4451 if (.not. any(iq_ibz == gwr%my_qibz_inds)) cycle 4452 ! TODO: Recheck API and scaling factor. 4453 call gwr_rpr_to_ggp(gwr, gwr%tchi_desc_qibz(iq_ibz), chiq_rpr(iq_ibz), gwr%tchi_qibz(iq_ibz,itau,spin)) 4454 tchi_rfact = one / gwr%cryst%ucvol !/ (gwr%nkbz * gwr%nqbz) 4455 gwr%tchi_qibz(iq_ibz,itau,spin)%buffer_cplx = gwr%tchi_qibz(iq_ibz,itau,spin)%buffer_cplx * tchi_rfact 4456 end do ! my_iqi 4457 4458 write(msg,'(3(a,i0),a)')" My itau [", my_it, "/", gwr%my_ntau, "] (tot: ", gwr%ntau, ")" 4459 call cwtime_report(msg, cpu_tau, wall_tau, gflops_tau) 4460 end do ! my_it 4461 end do ! spin 4462 4463 ! Free memory 4464 call slk_array_free(gk_rpr_pm); call slk_array_free(gkq_rpr_pm); call slk_array_free(chiq_rpr) 4465 ABI_FREE(chiq_rpr) 4466 do iq_ibz=1,gwr%nqibz 4467 call ltg_qibz(iq_ibz)%free() 4468 end do 4469 ABI_FREE(ltg_qibz) 4470 !call wrtout(std_out, " Mixed space algorithm for chi completed") 4471 end if 4472 4473 !call wrtout(std_out, sjoin(" max_abs_imag_chit", ftoa(max_abs_imag_chit))) 4474 4475 ! Print trace of chi_q(i tau) matrices for testing purposes. 4476 if (gwr%dtset%prtvol > 0) call gwr%print_trace("tchi_qibz") 4477 4478 ! Transform irreducible tchi from imaginary tau to imaginary omega. 4479 ! Also sum over spins to get total tchi if collinear spin. 4480 call gwr%cos_transform("tchi", "it2w", sum_spins=.True.) 4481 4482 if (gwr%kpt_comm%me == 0) then 4483 ! =================================================== 4484 ! ==== Construct head and wings from the tensor ===== 4485 ! =================================================== 4486 associate (desc_q0 => gwr%tchi_desc_qibz(1), mat_ts => gwr%tchi_qibz(1,:,:)) 4487 ABI_CHECK_IEQ(desc_q0%ig0, 1, "ig0 should be 1") 4488 ABI_MALLOC(up_wing_q, (desc_q0%npw)) 4489 ABI_MALLOC(low_wing_q, (desc_q0%npw)) 4490 4491 do my_is=1,gwr%my_nspins 4492 spin = gwr%my_spins(my_is) 4493 do my_it=1,gwr%my_ntau 4494 itau = gwr%my_itaus(my_it) 4495 4496 do ig=2,desc_q0%npw 4497 wng = gwr%chi0_uwing_myw(:,ig, my_it) 4498 up_wing_q(ig) = vdotw(gwr%q0, wng, gwr%cryst%gmet, "G") 4499 wng = gwr%chi0_lwing_myw(:,ig,my_it) 4500 low_wing_q(ig) = vdotw(gwr%q0, wng, gwr%cryst%gmet, "G") 4501 end do 4502 chq = matmul(gwr%chi0_head_myw(:,:,my_it), gwr%q0) 4503 head_q = vdotw(gwr%q0, chq, gwr%cryst%gmet, "G") 4504 4505 call mat_ts(itau, spin)%set_head_and_wings(head_q, low_wing_q, up_wing_q) 4506 end do ! my_it 4507 end do ! my_is 4508 end associate 4509 ABI_FREE(up_wing_q) 4510 ABI_FREE(low_wing_q) 4511 end if 4512 4513 ! Print trace of chi_q(i omega) matrices for testing purposes. 4514 if (gwr%dtset%prtvol > 0) call gwr%print_trace("tchi_qibz") 4515 4516 ! Write file with chi0(i omega) if asked by user. 4517 if (gwr%dtset%prtsuscep > 0) call gwr%ncwrite_tchi_wc("tchi", trim(gwr%dtfil%filnam_ds(4))//'_TCHIM.nc') 4518 4519 call cwtime_report(" gwr_build_tchi:", cpu_all, wall_all, gflops_all) 4520 call timab(1923, 2, tsec) 4521 4522 contains 4523 4524 subroutine print_chi_header() 4525 if (gwr%comm%me /= 0) return 4526 if (gwr%use_supercell_for_tchi) then 4527 call wrtout(std_out, " Building chi0 in the supercell with FFTs ", pre_newlines=2) 4528 else 4529 call wrtout(std_out, " Building chi_q(r,r') with convolutions in k-space:", pre_newlines=2) 4530 end if 4531 call wrtout(std_out, sjoin(" gwr_np_kgts:", ltoa(gwr%dtset%gwr_np_kgts))) 4532 call wrtout(std_out, sjoin(" ngkpt:", ltoa(gwr%ngkpt), ", ngqpt:", ltoa(gwr%ngqpt))) 4533 call wrtout(std_out, sjoin(" gwr_boxcutmin:", ftoa(gwr%dtset%gwr_boxcutmin))) 4534 call wrtout(std_out, sjoin(" sc_ngfft:", ltoa(sc_ngfft(1:8)))) 4535 call wrtout(std_out, sjoin(" my_ntau:", itoa(gwr%my_ntau), ", ntau:", itoa(gwr%ntau))) 4536 call wrtout(std_out, sjoin(" my_nkbz:", itoa(gwr%my_nkbz), ", nkbz:", itoa(gwr%nkbz))) 4537 call wrtout(std_out, sjoin(" my_nkibz:", itoa(gwr%my_nkibz), ", nkibz:", itoa(gwr%nkibz))) 4538 call wrtout(std_out, sjoin("- FFT uc_batch_size:", itoa(gwr%uc_batch_size))) 4539 call wrtout(std_out, sjoin("- FFT sc_batch_size:", itoa(gwr%sc_batch_size)), do_flush=.True.) 4540 end subroutine print_chi_header 4541 4542 end subroutine gwr_build_tchi
m_gwr/gwr_build_wc [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_build_wc
FUNCTION
Compute Wc(i tau,g,g') from tchi(i omega,g,g')
INPUTS
OUTPUT
SOURCE
4892 subroutine gwr_build_wc(gwr) 4893 4894 !Arguments ------------------------------------ 4895 class(gwr_t),target,intent(inout) :: gwr 4896 4897 #ifndef FC_CRAY 4898 !Local variables------------------------------- 4899 !scalars 4900 integer,parameter :: master = 0 4901 integer :: my_iqi, my_it, my_is, iq_ibz, spin, itau, iw, ierr 4902 integer :: il_g1, il_g2, ig1, ig2, iglob1, iglob2, ig0 4903 real(dp) :: cpu_all, wall_all, gflops_all, cpu_q, wall_q, gflops_q 4904 logical :: q_is_gamma, free_tchi, print_time 4905 character(len=5000) :: msg 4906 complex(dpc) :: vcs_g1, vcs_g2 4907 type(__slkmat_t) :: em1 4908 type(yamldoc_t) :: ydoc 4909 !arrays 4910 real(dp) :: qq_ibz(3), tsec(2) 4911 complex(dpc) :: em1_wq(gwr%ntau, gwr%nqibz), eps_wq(gwr%ntau, gwr%nqibz) 4912 4913 ! ************************************************************************* 4914 4915 call cwtime(cpu_all, wall_all, gflops_all, "start") 4916 call timab(1924, 1, tsec) 4917 call wrtout([std_out, ab_out], " Building correlated screening Wc ...", pre_newlines=2) 4918 ABI_CHECK(gwr%tchi_space == "iomega", sjoin("tchi_space: ", gwr%tchi_space, " != iomega")) 4919 4920 if (allocated(gwr%wc_qibz)) then 4921 call slk_array_free(gwr%wc_qibz) 4922 ABI_FREE(gwr%wc_qibz) 4923 gwr%wc_space = "none" 4924 end if 4925 4926 ABI_CHECK(gwr%wc_space == "none", sjoin("wc_space: ", gwr%wc_space, " != none")) 4927 gwr%wc_space = "iomega" 4928 4929 ! ======================================= 4930 ! Allocate PBLAS arrays for wc_qibz(g,g') 4931 ! ======================================= 4932 ! Note that we have already summed tchi over spin. 4933 ! Also, G=0 corresponds to iglob = 1 as only q-points in the IBZ are treated here. 4934 ! This is not true for the other q-points in the full BZ as we may have a non-zero umklapp g0_q. 4935 ABI_MALLOC(gwr%wc_qibz, (gwr%nqibz, gwr%ntau, gwr%nsppol)) 4936 4937 free_tchi = .True.; if (free_tchi) gwr%tchi_space = "none" 4938 em1_wq = zero; eps_wq = zero 4939 4940 ! If possible, use 2d rectangular grid of processors for diagonalization. 4941 !call slkproc_4diag%init(gwr%g_comm%value) 4942 4943 do my_iqi=1,gwr%my_nqibz 4944 print_time = gwr%comm%me == 0 .and. (my_iqi <= LOG_MODK .or. mod(my_iqi, LOG_MODK) == 0) 4945 if (print_time) call cwtime(cpu_q, wall_q, gflops_q, "start") 4946 iq_ibz = gwr%my_qibz_inds(my_iqi); qq_ibz = gwr%qibz(:, iq_ibz) 4947 q_is_gamma = normv(qq_ibz, gwr%cryst%gmet, "G") < GW_TOLQ0 4948 4949 associate (desc_q => gwr%tchi_desc_qibz(iq_ibz)) 4950 ig0 = desc_q%ig0 4951 4952 ! The spin loop is needed so that procs in different pools can operate 4953 ! on their own matrix that has been already summed over (collinear) spins. 4954 do my_is=1,gwr%my_nspins 4955 spin = gwr%my_spins(my_is) 4956 do my_it=1,gwr%my_ntau 4957 itau = gwr%my_itaus(my_it) 4958 4959 ! Build symmetrized RPA epsilon: 1 - Vc^{1/2} chi0 Vc^{1/2} 4960 associate (wc => gwr%wc_qibz(iq_ibz, itau, spin)) 4961 call gwr%tchi_qibz(iq_ibz, itau, spin)%copy(wc) 4962 if (free_tchi) call gwr%tchi_qibz(iq_ibz, itau, spin)%free() 4963 4964 do il_g2=1,wc%sizeb_local(2) 4965 iglob2 = wc%loc2gcol(il_g2) 4966 ig2 = mod(iglob2 - 1, desc_q%npw) + 1 4967 vcs_g2 = desc_q%vc_sqrt(ig2) 4968 do il_g1=1,wc%sizeb_local(1) 4969 iglob1 = wc%loc2grow(il_g1) 4970 ig1 = mod(iglob1 - 1, desc_q%npw) + 1 4971 vcs_g1 = desc_q%vc_sqrt(ig1) 4972 wc%buffer_cplx(il_g1, il_g2) = -wc%buffer_cplx(il_g1, il_g2) * vcs_g1 * vcs_g2 4973 if (iglob1 == iglob2) then 4974 wc%buffer_cplx(il_g1, il_g2) = one + wc%buffer_cplx(il_g1, il_g2) 4975 if (iglob1 == ig0 .and. iglob2 == ig0) then 4976 ! Store epsilon_{iw, iq_ibz}(0, 0) 4977 ! Rescale by np_qibz because we will MPI reduce this array. 4978 eps_wq(itau, iq_ibz) = wc%buffer_cplx(il_g1, il_g2) / gwr%np_qibz(iq_ibz) 4979 end if 4980 end if 4981 end do ! il_g1 4982 end do ! il_g2 4983 4984 ! Invert symmetrized epsilon. 4985 ! NB: PZGETRF requires square block cyclic decomposition along the two axes 4986 ! hence we have to redistribute the data before calling invert. 4987 4988 call wc%change_size_blocs(em1) ! processor=slkproc_4diag 4989 !call em1%invert() 4990 call em1%hpd_invert("U") ! TODO: Can use hpd_invert 4991 call wc%take_from(em1, free=.True.) ! processor=wc%processor) 4992 4993 !call wrtout(std_out, sjoin(" e-1 at q:", ktoa(qq_ibz), "i omega:", ftoa(gwr%iw_mesh(itau) * Ha_eV), "eV")) 4994 !call print_arr(wc%buffer_cplx, unit=std_out) 4995 4996 ! Build Wc(q, iw) = e^{-1}_q(g,g',iw) - delta_{gg'} v_q(g,g') by removing bare vc 4997 do il_g2=1,wc%sizeb_local(2) 4998 iglob2 = wc%loc2gcol(il_g2) 4999 ig2 = mod(iglob2 - 1, desc_q%npw) + 1 5000 vcs_g2 = desc_q%vc_sqrt(ig2) 5001 do il_g1=1,wc%sizeb_local(1) 5002 iglob1 = wc%loc2grow(il_g1) 5003 ig1 = mod(iglob1 - 1, desc_q%npw) + 1 5004 vcs_g1 = desc_q%vc_sqrt(ig1) 5005 5006 if (iglob1 == ig0 .and. iglob2 == ig0) then 5007 ! Store epsilon^{-1}_{iw, iq_ibz}(0, 0). Rescale by np_qibz because we will MPI reduce this array. 5008 em1_wq(itau, iq_ibz) = wc%buffer_cplx(il_g1, il_g2) / gwr%np_qibz(iq_ibz) 5009 end if 5010 5011 ! Subtract exchange part. 5012 if (iglob1 == iglob2) wc%buffer_cplx(il_g1, il_g2) = wc%buffer_cplx(il_g1, il_g2) - one 5013 5014 ! Handle divergence in Wc for q --> 0 5015 if (q_is_gamma .and. (iglob1 == ig0 .or. iglob2 == ig0)) then 5016 if (iglob1 == ig0 .and. iglob2 == ig0) then 5017 vcs_g1 = sqrt(gwr%vcgen%i_sz); vcs_g2 = sqrt(gwr%vcgen%i_sz) 5018 else if (iglob1 == ig0) then 5019 !vcs_g1 = (four_pi) ** (three/two) * q0sph ** 2 / two 5020 vcs_g1 = sqrt(gwr%vcgen%i_sz) 5021 else if (iglob2 == ig0) then 5022 !vcs_g2 = (four_pi) ** (three/two) * q0sph ** 2 / two 5023 vcs_g2 = sqrt(gwr%vcgen%i_sz) 5024 end if 5025 end if 5026 5027 wc%buffer_cplx(il_g1, il_g2) = wc%buffer_cplx(il_g1, il_g2) * vcs_g1 * vcs_g2 / gwr%cryst%ucvol 5028 end do ! il_g1 5029 end do ! il_g2 5030 end associate 5031 5032 end do ! my_it 5033 end do ! my_is 5034 end associate 5035 5036 if (print_time) then 5037 write(msg,'(4x,2(a,i0),a)')"My iqi [", my_iqi, "/", gwr%my_nqibz, "]" 5038 call cwtime_report(msg, cpu_q, wall_q, gflops_q) 5039 end if 5040 end do ! my_iqi 5041 5042 !call slkproc_4diag%free() 5043 5044 call xmpi_sum_master(em1_wq, master, gwr%kgt_comm%value, ierr) 5045 call xmpi_sum_master(eps_wq, master, gwr%kgt_comm%value, ierr) 5046 5047 if (gwr%comm%me == master) then 5048 ! Print results to ab_out for testing purposes. 5049 ydoc = yamldoc_open('EMACRO_WITHOUT_LOCAL_FIELDS') !, width=11, real_fmt='(3f8.3)') 5050 call ydoc%open_tabular("epsilon_{iw, q -> Gamma}(0,0)") ! comment="(iomega, iq_ibz)") 5051 do iw=1,gwr%ntau 5052 write(msg, "(3(es16.8,2x))") gwr%iw_mesh(iw), real(eps_wq(iw, 1)), aimag(eps_wq(iw, 1)) 5053 call ydoc%add_tabular_line(msg) 5054 end do 5055 call ydoc%write_units_and_free([ab_out, std_out]) 5056 5057 ydoc = yamldoc_open('EMACRO_WITH_LOCAL_FIELDS') !, width=11, real_fmt='(3f8.3)') 5058 call ydoc%open_tabular("epsilon_{iw, q -> Gamma}(0,0)") !, comment="(iomega, iq_ibz)") 5059 do iw=1,gwr%ntau 5060 write(msg, "(3(es16.8,2x))") gwr%iw_mesh(iw), real(em1_wq(iw, 1)), aimag(em1_wq(iw, 1)) 5061 call ydoc%add_tabular_line(msg) 5062 end do 5063 call ydoc%write_units_and_free([ab_out, std_out]) 5064 end if 5065 5066 ! Print trace of wc_q(itau) matrices for testing purposes. 5067 if (gwr%dtset%prtvol > 0) call gwr%print_trace("wc_qibz") 5068 5069 ! Write file with Wc(i omega) 5070 !if (gwr%dtset%prtsuscep > 0) call gwr%ncwrite_tchi_wc("wc", trim(gwr%dtfil%filnam_ds(4))//'_WCIMW.nc') 5071 5072 ! Cosine transform from iomega to itau to get Wc(i tau) 5073 call gwr%cos_transform("wc", "iw2t") 5074 5075 ! Write file with Wc(i tau) 5076 !if (gwr%dtset%prtsuscep > 0) call gwr%ncwrite_tchi_wc("wc", trim(gwr%dtfil%filnam_ds(4))//'_WCIMT.nc') 5077 5078 ! Print trace of wc_q(iomega) matrices for testing purposes. 5079 !if (gwr%dtset%prtvol > 0) call gwr%print_trace("wc_qibz") 5080 5081 call cwtime_report(" gwr_build_wc:", cpu_all, wall_all, gflops_all) 5082 call timab(1924, 2, tsec) 5083 5084 #endif 5085 end subroutine gwr_build_wc
m_gwr/gwr_cos_transform [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_cos_transform
FUNCTION
Perform cosine transform.
INPUTS
SOURCE
3496 subroutine gwr_cos_transform(gwr, what, mode, sum_spins) 3497 3498 !Arguments ------------------------------------ 3499 class(gwr_t),target,intent(inout) :: gwr 3500 character(len=*),intent(in) :: what, mode 3501 logical,optional,intent(in) :: sum_spins 3502 3503 !Local variables------------------------------- 3504 !scalars 3505 integer :: my_iqi, my_is, ig1, ig2, my_it, ierr, iq_ibz, itau, spin, it0, iw 3506 integer :: ndat, idat, loc1_size, loc2_size, batch_size 3507 real(dp) :: cpu, wall, gflops 3508 logical :: sum_spins_ 3509 !arrays 3510 integer,allocatable :: requests(:) 3511 real(dp), contiguous, pointer :: weights_ptr(:,:) 3512 complex(dp) :: wgt_globmy(gwr%ntau, gwr%my_ntau) ! Complex instead of real to be able to call ZGEMM. 3513 complex(dp),allocatable :: cwork_myit(:,:,:), glob_cwork(:,:,:) 3514 type(__slkmat_t), pointer :: mats(:) 3515 3516 ! ************************************************************************* 3517 3518 call cwtime(cpu, wall, gflops, "start") 3519 sum_spins_ = .False.; if (present(sum_spins)) sum_spins_ = sum_spins 3520 3521 call wrtout(std_out, sjoin(" Performing cosine transform. what:", what, ", mode:", mode)) 3522 3523 ! Target weights depending on mode. 3524 select case(mode) 3525 case ("iw2t") 3526 ! From omega to tau 3527 if (what == "tchi") then 3528 ABI_CHECK(gwr%tchi_space == "iomega", sjoin("mode:", mode, "with what:", what, "and tchi_space:", gwr%tchi_space)) 3529 gwr%tchi_space = "itau" 3530 end if 3531 if (what == "wc") then 3532 ABI_CHECK(gwr%wc_space == "iomega", sjoin("mode:", mode, "with what:", what, "and wc_space:", gwr%wc_space)) 3533 gwr%wc_space = "itau" 3534 end if 3535 weights_ptr => gwr%cosft_tw 3536 3537 case ("it2w") 3538 ! From tau to omega 3539 if (what == "tchi") then 3540 ABI_CHECK(gwr%tchi_space == "itau", sjoin("mode:", mode, " with what:", what, "and tchi_space:", gwr%tchi_space)) 3541 gwr%tchi_space = "iomega" 3542 end if 3543 if (what == "wc") then 3544 ABI_CHECK(gwr%wc_space == "itau", sjoin("mode:", mode, " with what:", what, "and wc_space:", gwr%wc_space)) 3545 gwr%wc_space = "iomega" 3546 end if 3547 weights_ptr => gwr%cosft_wt 3548 3549 case default 3550 ABI_ERROR(sjoin("Wrong mode:", mode)) 3551 end select 3552 3553 ! Extract my weights from global array. 3554 do my_it=1,gwr%my_ntau 3555 itau = gwr%my_itaus(my_it) 3556 do iw=1,gwr%ntau 3557 wgt_globmy(iw, my_it) = weights_ptr(iw, itau) 3558 end do 3559 end do 3560 3561 ! Perform inhomogeneous FT in parallel. 3562 do my_is=1,gwr%my_nspins 3563 spin = gwr%my_spins(my_is) 3564 do my_iqi=1,gwr%my_nqibz 3565 iq_ibz = gwr%my_qibz_inds(my_iqi) 3566 associate (desc_q => gwr%tchi_desc_qibz(iq_ibz)) 3567 3568 mats => null() 3569 if (what == "tchi") mats => gwr%tchi_qibz(iq_ibz, :, spin) 3570 if (what =="wc") mats => gwr%wc_qibz(iq_ibz, :, spin) 3571 ABI_CHECK(associated(mats), sjoin("Invalid value for what:", what)) 3572 3573 ! Use the first itau index to get the size of the local buffer. 3574 ! Block over ig2 to reduce the number of MPI calls and take advantage of ZGEMM. 3575 it0 = gwr%my_itaus(1) 3576 loc1_size = mats(it0)%sizeb_local(1) 3577 loc2_size = mats(it0)%sizeb_local(2) 3578 3579 ! batch_size in terms of columns 3580 ! TODO: Determine batch_size automatically to avoid going OOM 3581 !batch_size = 1 3582 !batch_size = 24 3583 batch_size = 48 3584 !batch_size = loc2_size 3585 3586 ABI_MALLOC(cwork_myit, (gwr%my_ntau, loc1_size, batch_size)) 3587 ABI_MALLOC(glob_cwork, (gwr%ntau, loc1_size, batch_size)) 3588 ABI_MALLOC(requests, (batch_size)) 3589 3590 do ig2=1,mats(it0)%sizeb_local(2), batch_size 3591 ndat = blocked_loop(ig2, mats(it0)%sizeb_local(2), batch_size) 3592 3593 ! Extract matrix elements as a function of tau. 3594 do idat=1,ndat 3595 do my_it=1,gwr%my_ntau 3596 itau = gwr%my_itaus(my_it) 3597 do ig1=1,mats(it0)%sizeb_local(1) 3598 cwork_myit(my_it, ig1, idat) = mats(itau)%buffer_cplx(ig1, ig2+idat-1) 3599 end do 3600 end do 3601 end do 3602 3603 ! Compute contribution to itau matrix 3604 do idat=1,ndat 3605 call ZGEMM("N", "N", gwr%ntau, loc1_size, gwr%my_ntau, cone, & 3606 wgt_globmy, gwr%ntau, cwork_myit(1,1,idat), gwr%my_ntau, czero, glob_cwork(1,1,idat), gwr%ntau) 3607 !call xmpi_isum_ip(glob_cwork(:,:,idat), gwr%tau_comm%value, requests(idat), ierr) 3608 end do 3609 3610 !call xmpi_waitall_1d(requests(1:ndat), ierr) 3611 call xmpi_sum(glob_cwork, gwr%tau_comm%value, ierr) 3612 3613 ! Update my local (g1, g2) entry to have it in imaginary-frequency. 3614 !!!$OMP PARALLEL DO PRIVATE(itau) 3615 do idat=1,ndat 3616 do my_it=1,gwr%my_ntau 3617 itau = gwr%my_itaus(my_it) 3618 do ig1=1,mats(it0)%sizeb_local(1) 3619 mats(itau)%buffer_cplx(ig1, ig2+idat-1) = glob_cwork(itau, ig1, idat) 3620 end do 3621 end do 3622 end do 3623 3624 end do ! ig2 3625 3626 ABI_FREE(cwork_myit) 3627 ABI_FREE(glob_cwork) 3628 ABI_FREE(requests) 3629 end associate 3630 end do ! my_iqi 3631 end do ! my_is 3632 3633 if (sum_spins_ .and. gwr%nspinor /= 2) then ! gwr%nsppol == 2 .and. 3634 ! Sum over spins 3635 do my_iqi=1,gwr%my_nqibz 3636 iq_ibz = gwr%my_qibz_inds(my_iqi) 3637 do my_is=1,gwr%my_nspins 3638 spin = gwr%my_spins(my_is) 3639 mats => null() 3640 if (what == "tchi") mats => gwr%tchi_qibz(iq_ibz,:,spin) 3641 !if (what =="wc") mats => gwr%wc_qibz(iq_ibz, :, spin) 3642 ABI_CHECK(associated(mats), sjoin("Invalid value for what:", what)) 3643 3644 do my_it=1,gwr%my_ntau 3645 itau = gwr%my_itaus(my_it) 3646 3647 if (gwr%nsppol == 1) then 3648 mats(itau)%buffer_cplx = two * mats(itau)%buffer_cplx 3649 3650 else if (gwr%nsppol == 2) then 3651 if (gwr%spin_comm%nproc > 1) then 3652 ! Spins are distributed thus we have to sum them. 3653 call xmpi_sum(mats(itau)%buffer_cplx, gwr%spin_comm%value, ierr) 3654 else 3655 ! Spins are not distributed. This should happen only in sequential. 3656 if (spin == 1) then 3657 mats(itau)%buffer_cplx = mats(itau)%buffer_cplx + gwr%tchi_qibz(iq_ibz,itau,spin+1)%buffer_cplx 3658 gwr%tchi_qibz(iq_ibz,itau,spin+1)%buffer_cplx = mats(itau)%buffer_cplx 3659 end if 3660 end if 3661 end if 3662 3663 end do ! my_it 3664 end do ! my_is 3665 end do ! my_iqi 3666 end if 3667 3668 call cwtime_report(" gwr_cos_transform:", cpu, wall, gflops) 3669 3670 end subroutine gwr_cos_transform
m_gwr/gwr_free [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_free
FUNCTION
Free dynamic memory
SOURCE
1899 subroutine gwr_free(gwr) 1900 1901 !Arguments ------------------------------------ 1902 class(gwr_t), intent(inout) :: gwr 1903 1904 ! ************************************************************************* 1905 1906 ABI_SFREE(gwr%ks_vbik) 1907 ABI_SFREE(gwr%kbz) 1908 ABI_SFREE(gwr%kbz2ibz) 1909 ABI_SFREE(gwr%kbz2ibz_symrel) 1910 ABI_SFREE(gwr%qbz2ibz) 1911 ABI_SFREE(gwr%my_kbz_inds) 1912 ABI_SFREE(gwr%my_kibz_inds) 1913 ABI_SFREE(gwr%my_qbz_inds) 1914 ABI_SFREE(gwr%my_qibz_inds) 1915 ABI_SFREE(gwr%qbz) 1916 ABI_SFREE(gwr%qibz) 1917 ABI_SFREE(gwr%wtq) 1918 ABI_SFREE(gwr%chi0_head_myw) 1919 ABI_SFREE(gwr%chi0_uwing_myw) 1920 ABI_SFREE(gwr%chi0_lwing_myw) 1921 ABI_SFREE(gwr%qbz2ibz) 1922 ABI_SFREE(gwr%my_spins) 1923 ABI_SFREE(gwr%my_itaus) 1924 ABI_SFREE(gwr%tau_master) 1925 ABI_SFREE(gwr%np_kibz) 1926 ABI_SFREE(gwr%itreat_ikibz) 1927 ABI_SFREE(gwr%np_qibz) 1928 ABI_SFREE(gwr%itreat_iqibz) 1929 !#ifdef __HAVE_GREENX 1930 ABI_SFREE_NOCOUNT(gwr%tau_mesh) 1931 ABI_SFREE_NOCOUNT(gwr%tau_wgs) 1932 ABI_SFREE_NOCOUNT(gwr%iw_mesh) 1933 ABI_SFREE_NOCOUNT(gwr%iw_wgs) 1934 ABI_SFREE_NOCOUNT(gwr%cosft_tw) 1935 ABI_SFREE_NOCOUNT(gwr%cosft_wt) 1936 ABI_SFREE_NOCOUNT(gwr%sinft_wt) 1937 !#endif 1938 ABI_SFREE(gwr%kcalc) 1939 ABI_SFREE(gwr%bstart_ks) 1940 ABI_SFREE(gwr%bstop_ks) 1941 ABI_SFREE(gwr%nbcalc_ks) 1942 ABI_SFREE(gwr%kcalc2ibz) 1943 ABI_SFREE(gwr%sigx_mat) 1944 ABI_SFREE(gwr%sigc_iw_mat) 1945 ABI_SFREE(gwr%chinpw_qibz) 1946 1947 call gwr%ks_gaps%free() 1948 call ebands_free(gwr%qp_ebands) 1949 call ebands_free(gwr%qp_ebands_prev) 1950 call gwr%kcalc_wfd%free() 1951 call gwr%wfk_hdr%free() 1952 1953 ! Free descriptors 1954 if (allocated(gwr%green_desc_kibz)) then 1955 call desc_array_free(gwr%green_desc_kibz) 1956 ABI_FREE(gwr%green_desc_kibz) 1957 end if 1958 if (allocated(gwr%tchi_desc_qibz)) then 1959 call desc_array_free(gwr%tchi_desc_qibz) 1960 ABI_FREE(gwr%tchi_desc_qibz) 1961 end if 1962 1963 ! Free PBLAS matrices 1964 if (allocated(gwr%gt_kibz)) then 1965 call slk_array_free(gwr%gt_kibz) 1966 ABI_FREE(gwr%gt_kibz) 1967 end if 1968 if (allocated(gwr%tchi_qibz)) then 1969 call slk_array_free(gwr%tchi_qibz) 1970 ABI_FREE(gwr%tchi_qibz) 1971 end if 1972 if (allocated(gwr%wc_qibz)) then 1973 call slk_array_free(gwr%wc_qibz) 1974 ABI_FREE(gwr%wc_qibz) 1975 end if 1976 if (allocated(gwr%sigc_kibz)) then 1977 call slk_array_free(gwr%sigc_kibz) 1978 ABI_FREE(gwr%sigc_kibz) 1979 end if 1980 ! Release the scalapack pressor. 1981 call gwr%g_slkproc%free() 1982 1983 if (allocated(gwr%ugb)) then 1984 call slk_array_free(gwr%ugb) 1985 ABI_FREE(gwr%ugb) 1986 end if 1987 !if (allocated(gwr%nato_ugb)) then 1988 ! call slk_array_free(gwr%nato_ugb) 1989 ! ABI_FREE(gwr%nato_ugb) 1990 !end if 1991 call gwr%gtau_slkproc%free() 1992 1993 ! datatypes. 1994 call gwr%ks_me%free() 1995 call gwr%vcgen%free() 1996 1997 if (allocated(gwr%degtab)) then 1998 call degtab_array_free(gwr%degtab) 1999 ABI_FREE(gwr%degtab) 2000 end if 2001 2002 ! Free MPI communicators 2003 call gwr%spin_comm%free(); call gwr%g_comm%free(); call gwr%tau_comm%free() 2004 call gwr%kpt_comm%free(); call gwr%gtau_comm%free(); call gwr%kg_comm%free() 2005 call gwr%kgt_comm%free(); call gwr%kts_comm%free(); call gwr%comm%free() 2006 2007 end subroutine gwr_free
m_gwr/gwr_gamma_gw [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_gamma_gw
FUNCTION
INPUTS
vpsp(nfft)=local psp (Hartree)
OUTPUT
SOURCE
8061 subroutine gwr_gamma_gw(gwr, nfftf, ngfftf, vpsp) 8062 8063 use m_gwrdm, only : calc_rdmx,calc_rdmc,natoccs,update_hdr_bst,print_tot_occ,get_chkprdm,& 8064 print_chkprdm,change_matrix,print_total_energy,print_band_energies 8065 use m_spacepar, only : hartre 8066 8067 !Arguments ------------------------------------ 8068 class(gwr_t),target,intent(inout) :: gwr 8069 integer,intent(in) :: nfftf 8070 !arrays 8071 integer,intent(in) :: ngfftf(18) 8072 real(dp),intent(in) :: vpsp(nfftf) 8073 8074 !Local variables------------------------------- 8075 !scalars 8076 integer,parameter :: master = 0, cplex1 = 1, tim_fourdp5 = 5 8077 integer :: spin, ikcalc, ik_ibz, ib, ib1, ib2, nkibz, nsppol, mband, ierr, b1gw, b2gw 8078 !real(dp) :: cpu, wall, gflops 8079 real(dp) :: evext_energy,den_int !,coef_hyb,exc_mbb_energy eh_energy, ekin_energy, 8080 real(dp) :: gsqcut,boxcut,ecutf 8081 character(len=500) :: msg 8082 type(hdr_type) :: Hdr_sigma 8083 !arrays 8084 integer :: units(2) 8085 real(dp),parameter :: k0(3) = zero 8086 !real(dp) :: kgw(3) ! kk_ibz(3), 8087 real(dp),allocatable :: nat_occs(:,:), gw_rhor(:,:), gw_rhog(:,:), gw_vhartr(:) 8088 complex(dpc),allocatable :: xrdm_k_full(:,:,:), rdm_k(:,:), pot_k(:,:), nateigv(:,:,:,:), old_ks_purex(:,:), new_hartr(:,:) 8089 complex(dp) :: omega_i(gwr%ntau) 8090 complex(dpc),allocatable :: sigcme_k(:,:,:,:) 8091 8092 ! ************************************************************************* 8093 8094 call gwr%run_g0w0(free_ugb=.False.) 8095 ! TODO: Might release some PBLAS memory for W at this point 8096 8097 ! This section is copied from m_sigma_driver with small changes in order to intergrate it with the gwr% object. 8098 associate (dtset => gwr%dtset, qp_ebands => gwr%qp_ebands, ks_me => gwr%ks_me, psps => gwr%psps, & 8099 Wfd_nato_master => gwr%kcalc_wfd, dtfil => gwr%dtfil, cryst => gwr%cryst) 8100 8101 units = [std_out, ab_out] 8102 nkibz = gwr%nkibz; nsppol = gwr%nsppol; b1gw = gwr%b1gw; b2gw = gwr%b2gw 8103 ! Don't take mband from ks_ebands but compute it from gwr%bstop_ks 8104 mband = maxval(gwr%bstop_ks) !; mband = gwr%ks_ebands%mband 8105 8106 ! Note: all subroutines of 70_gw/m_gwrdm.F90 are implemented assuming nsppol == 1 8107 ABI_CHECK(dtset%nsppol == 1, "1-RDM GW correction only implemented for restricted closed-shell calculations!") 8108 ABI_CHECK(.not. gwr%sig_diago, "sig_diago should be false") 8109 8110 ABI_CALLOC(nateigv, (mband, mband, nkibz, nsppol)) 8111 ABI_CALLOC(nat_occs, (mband, nkibz)) 8112 ABI_CALLOC(xrdm_k_full, (b1gw:b2gw, b1gw:b2gw, nkibz)) 8113 8114 write(msg,'(a34,2i9)')' Bands used for the GW 1RDM arrays',b1gw,b2gw 8115 call wrtout(units, msg) 8116 8117 do ik_ibz=1,nkibz 8118 do ib=b1gw,b2gw 8119 xrdm_k_full(ib,ib,ik_ibz) = qp_ebands%occ(ib,ik_ibz,1) 8120 end do 8121 do ib=1,mband 8122 ! Copy initial occ numbers (in principle 2 or 0 from KS-DFT) 8123 nat_occs(ib,ik_ibz) = qp_ebands%occ(ib,ik_ibz,1) 8124 ! Set to identity matrix 8125 nateigv(ib,ib,ik_ibz,1) = cone 8126 end do 8127 end do 8128 8129 omega_i = j_dpc * gwr%iw_mesh 8130 8131 do spin=1,gwr%nsppol 8132 do ikcalc=1,gwr%nkcalc ! TODO: Should be spin dependent! 8133 ! Index of the irred k-point 8134 ik_ibz = gwr%kcalc2ibz(ikcalc, 1) 8135 !kgw = gwr%kcalc(:, ikcalc) 8136 ! min and max band indices for GW corrections (for this k-point) 8137 ib1 = gwr%bstart_ks(ikcalc, spin); ib2 = gwr%bstop_ks(ikcalc, spin) 8138 8139 ! Compute Sigma_x - Vxc or DELTA Sigma_x - Vxc 8140 ! where DELTA Sigma_x = Sigma_x - hyb_parameter Vx^exact for hyb Functionals. 8141 ! NB: Only restricted closed-shell calcs are implemented here 8142 ABI_CALLOC(pot_k, (ib1:ib2, ib1:ib2)) 8143 ABI_CALLOC(rdm_k, (ib1:ib2, ib1:ib2)) 8144 pot_k(ib1:ib2,ib1:ib2) = gwr%sigx_mat(ib1:ib2,ib1:ib2,ik_ibz,spin) - ks_me%vxcval(ib1:ib2,ib1:ib2,ik_ibz,spin) 8145 call calc_rdmx(ib1, ib2, ik_ibz, pot_k, rdm_k, qp_ebands) 8146 8147 ! Update the full 1RDM with the exchange corrected one for this k-point 8148 xrdm_k_full(ib1:ib2,ib1:ib2,ik_ibz) = xrdm_k_full(ib1:ib2,ib1:ib2,ik_ibz) + rdm_k(ib1:ib2,ib1:ib2) 8149 8150 ! Compute NAT ORBS for exchange corrected 1-RDM 8151 ! Only restricted closed-shell calcs 8152 do ib=ib1,ib2 8153 rdm_k(ib,ib) = rdm_k(ib,ib) + qp_ebands%occ(ib,ik_ibz,1) 8154 end do 8155 call natoccs(ib1, ib2, rdm_k, nateigv, nat_occs, qp_ebands, ik_ibz, iinfo=0) 8156 8157 ! ================ 8158 ! Correlation part 8159 ! ================ 8160 ! TODO 8161 !ABI_CALLOC(sigcme_k, (gwr%ntau, ib2-ib1+1, ib2-ib1+1, nsppol*gwr%nsig_ab)) 8162 !gwr%sigc_iw_mat((gwr%ntau, ib1:, ib1:, nsppol*gwr%nsig_ab)) 8163 call calc_rdmc(ib1, ib2, ik_ibz, omega_i, gwr%iw_wgs, sigcme_k, qp_ebands, rdm_k) 8164 !ABI_FREE(sigcme_k) 8165 8166 ! Update the full 1RDM with the GW corrected one for this k-point 8167 ! Only restricted closed-shell calcs 8168 rdm_k(ib1:ib2,ib1:ib2) = xrdm_k_full(ib1:ib2,ib1:ib2,ik_ibz) + rdm_k(ib1:ib2,ib1:ib2) 8169 ! Compute nat orbs and occ numbers at k-point ik_ibz 8170 call natoccs(ib1, ib2, rdm_k, nateigv, nat_occs, qp_ebands, ik_ibz, iinfo=1) 8171 8172 ABI_FREE(pot_k) 8173 ABI_FREE(rdm_k) 8174 end do ! ikcalc 8175 end do ! spin 8176 8177 ABI_CALLOC(gw_rhor, (nfftf, dtset%nspden)) 8178 call hdr_copy(gwr%wfk_hdr, hdr_sigma) 8179 8180 ! NRM WARNING: only the master has bands on Wfd_nato_master so it prints everything and computes gw_rhor 8181 ! 8182 ! All procs. update the qp_ebands and the Hdr_sigma 8183 call update_hdr_bst(Wfd_nato_master, nat_occs, b1gw, b2gw, qp_ebands, Hdr_sigma, Dtset%ngfft(1:3)) 8184 8185 ! Compute unit cell (averaged) occ = \sum _k weight_k occ_k 8186 call print_tot_occ(qp_ebands) 8187 8188 if (gwr%comm%me == master) then 8189 call Wfd_nato_master%rotate(cryst, nateigv) !, bmask=bdm_mask) ! Let it use bdm_mask and build NOs 8190 call Wfd_nato_master%mkrho(cryst, psps, qp_ebands, ngfftf, nfftf, gw_rhor) ! Construct the density 8191 if (dtset%prtwf == 1) then 8192 ! Print WFK file, here qp_ebands contains nat. orb. occs. 8193 call Wfd_nato_master%write_wfk(Hdr_sigma, qp_ebands, dtfil%fnameabo_wfk, wfknocheck=.True.) 8194 end if 8195 if (dtset%prtden == 1) then 8196 ! Print DEN file 8197 call fftdatar_write("density",dtfil%fnameabo_den,dtset%iomode,Hdr_sigma,& 8198 Cryst,ngfftf,cplex1,nfftf,dtset%nspden,gw_rhor,gwr%mpi_enreg,ebands=qp_ebands) 8199 end if 8200 end if 8201 call xmpi_bcast(gw_rhor, master, gwr%comm%value, ierr) 8202 call hdr_sigma%free() 8203 8204 ! Compute energies only if all k-points are available 8205 ! We need the hole 1-RDM to build Fock[GW.1RDM]! 8206 ABI_CALLOC(old_ks_purex, (b1gw:b2gw, gwr%nkcalc)) 8207 ABI_CALLOC(new_hartr, (b1gw:b2gw, gwr%nkcalc)) 8208 ABI_CALLOC(gw_rhog, (2, nfftf)) 8209 ABI_CALLOC(gw_vhartr, (nfftf)) 8210 ! 8211 ! A) Compute Evext = int rho(r) vext(r) dr -> simply dot product on the FFT grid 8212 ! Only restricted closed-shell calcs 8213 ! 8214 den_int = sum(gw_rhor(:,1)) * cryst%ucvol / nfftf 8215 evext_energy = sum(gw_rhor(:,1) * vpsp(:)) * cryst%ucvol / nfftf 8216 ! 8217 ! B) Coulomb <KS_i|Vh[NO]|KS_j> 8218 ! 8219 ! FFT to build gw_rhog 8220 call fourdp(1, gw_rhog, gw_rhor(:,1), -1, gwr%mpi_enreg, nfftf, ndat1, ngfftf, tim_fourdp5) 8221 8222 ecutf = dtset%ecut 8223 if (psps%usepaw == 1) then 8224 ecutf = dtset%pawecutdg 8225 call wrtout(std_out, ch10//' FFT (fine) grid used in PAW GW update:') 8226 end if 8227 8228 call getcut(boxcut, ecutf, cryst%gmet, gsqcut, dtset%iboxcut, std_out, k0, ngfftf) 8229 call hartre(1, gsqcut, dtset%icutcoul, psps%usepaw, gwr%mpi_enreg, nfftf, ngfftf, dtset%nkpt, dtset%rcut, & 8230 gw_rhog, cryst%rprimd, dtset%vcutgeo, gw_vhartr) 8231 8232 ! TODO 8233 8234 ABI_FREE(nateigv) 8235 ABI_FREE(nat_occs) 8236 ABI_FREE(xrdm_k_full) 8237 ABI_FREE(gw_rhor) 8238 ABI_FREE(old_ks_purex) 8239 ABI_FREE(new_hartr) 8240 ABI_FREE(gw_rhog) 8241 ABI_FREE(gw_vhartr) 8242 end associate 8243 8244 end subroutine gwr_gamma_gw
m_gwr/gwr_get_gkbz_rpr_pm [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_get_gkbz_rpr_pm
FUNCTION
Compute G_k(r',r) from G_k(g,g') for k in the BZ and given spin and tau. Note that output matrix `gk_rpr_pm` is transposed i.e. (r',r) instead of (r,r').
INPUTS
OUTPUT
SOURCE
2978 subroutine gwr_get_gkbz_rpr_pm(gwr, ik_bz, itau, spin, gk_rpr_pm, g0, ipm_list) 2979 2980 !Arguments ------------------------------------ 2981 class(gwr_t),intent(in) :: gwr 2982 integer,intent(in) :: ik_bz, itau, spin 2983 type(__slkmat_t),intent(inout) :: gk_rpr_pm(2) 2984 integer,optional,intent(in) :: g0(3), ipm_list(:) 2985 2986 !Local variables------------------------------- 2987 !scalars 2988 integer :: ig2, ipm, npwsp, col_bsize, ir1, idat, ndat, ii, num_pm, ipm_list__(2) 2989 logical :: have_g0 2990 !real(dp) :: cpu, wall, gflops 2991 type(__slkmat_t) :: rgp, gt_pm(2), gpr 2992 type(desc_t) :: desc_kbz 2993 type(uplan_t) :: uplan_k 2994 complex(gwpc),allocatable :: ceig0r(:) 2995 character(len=500) :: msg 2996 2997 ! ************************************************************************* 2998 2999 !call cwtime(cpu, wall, gflops, "start") 3000 3001 num_pm = 2; ipm_list__ = [1, 2] 3002 if (present(ipm_list)) then 3003 num_pm = size(ipm_list) 3004 ABI_CHECK_IRANGE(num_pm, 1, 2, "num_pm not in [1, 2]") 3005 ipm_list__(1:num_pm) = ipm_list 3006 end if 3007 3008 have_g0 = .False. 3009 if (present(g0)) then 3010 ! NB: Non-zero g0, requires the application of the phase. 3011 if (any(g0 /= 0)) then 3012 have_g0 = .True. 3013 ABI_MALLOC(ceig0r, (gwr%g_nfft * gwr%nspinor)) 3014 call calc_ceigr(-g0, gwr%g_nfft, gwr%nspinor, gwr%g_ngfft, ceig0r) 3015 end if 3016 end if 3017 3018 ! Get G_k(g,g', +/- itau) in the BZ. 3019 call gwr%rotate_gpm(ik_bz, itau, spin, desc_kbz, gt_pm, ipm_list=ipm_list__) 3020 3021 call uplan_k%init(desc_kbz%npw, gwr%nspinor, gwr%uc_batch_size, gwr%g_ngfft, desc_kbz%istwfk, & 3022 desc_kbz%gvec, gwpc, gwr%dtset%gpu_option) 3023 3024 ! For each tau in imp_list__ 3025 do ii=1,num_pm 3026 ipm = ipm_list__(ii) 3027 ! Allocate temporary rgp PBLAS matrix to store G(r,g') 3028 npwsp = desc_kbz%npw * gwr%nspinor 3029 ABI_CHECK(block_dist_1d(npwsp, gwr%g_comm%nproc, col_bsize, msg), msg) 3030 call rgp%init(gwr%g_nfft * gwr%nspinor, npwsp, gwr%g_slkproc, desc_kbz%istwfk, size_blocs=[-1, col_bsize]) 3031 3032 associate (g_gp => gt_pm(ipm)) 3033 do ig2=1, g_gp%sizeb_local(2), gwr%uc_batch_size 3034 ! G_k(g,g') -> G_k(r,g') and store results in rgp. 3035 ndat = blocked_loop(ig2, g_gp%sizeb_local(2), gwr%uc_batch_size) 3036 call uplan_k%execute_gr(ndat, g_gp%buffer_cplx(:,ig2), rgp%buffer_cplx(:,ig2)) 3037 3038 ! Multiply by e^{ig0.r} 3039 if (have_g0) then 3040 do idat=0,ndat-1 3041 rgp%buffer_cplx(:,ig2+idat) = ceig0r(:) * rgp%buffer_cplx(:,ig2+idat) 3042 end do 3043 end if 3044 end do ! ig2 3045 end associate 3046 3047 ! MPI transpose: G_k(r,g') -> G_k(g',r) and transform g' index. 3048 call rgp%ptrans("N", gpr, free=.True.) 3049 3050 do ir1=1, gpr%sizeb_local(2), gwr%uc_batch_size 3051 ! G_k(g',r) -> G_k(r',r) and store results in rgp. 3052 ndat = blocked_loop(ir1, gpr%sizeb_local(2), gwr%uc_batch_size) 3053 call uplan_k%execute_gr(ndat, gpr%buffer_cplx(:,ir1), gk_rpr_pm(ipm)%buffer_cplx(:,ir1), isign=-1, iscale=0) 3054 3055 ! Multiply by e^{ig0.r} 3056 if (have_g0) then 3057 do idat=0,ndat-1 3058 gk_rpr_pm(ipm)%buffer_cplx(:, ir1+idat) = conjg(ceig0r) * gk_rpr_pm(ipm)%buffer_cplx(:, ir1+idat) 3059 end do 3060 end if 3061 end do ! ir1 3062 call gpr%free() 3063 3064 ! Rescale? 3065 !gk_rpr_pm(ipm)%buffer_cplx = gk_rpr_pm(ipm)%buffer_cplx * gwr%g_nfft 3066 end do ! ipm 3067 3068 call slk_array_free(gt_pm); call desc_kbz%free(); call uplan_k%free() 3069 3070 ABI_SFREE(ceig0r) 3071 !call cwtime_report(" gwr_get_gkbz_rpr_pm:", cpu, wall, gflops) 3072 3073 end subroutine gwr_get_gkbz_rpr_pm
m_gwr/gwr_get_myk_green_gpr [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_get_myk_green_gpr
FUNCTION
Use FFTs to compute G_k(g,g') --> G_k(g',r) for each k in the BZ treated by this MPI proc for given spin and tau. 1) FFT Transform the first index and multiply by e^{ik.r}: G_k(g,g') --> G_k(r,g') = e^{ik.r} \sum_g e^{ig.r} G_k(g,g') NB: This is a local operation. 2) MPI transpose the matrix to go from (r,g') to (g',r) distribution.
INPUTS
OUTPUT
SOURCE
2882 subroutine gwr_get_myk_green_gpr(gwr, itau, spin, desc_mykbz, gt_gpr) 2883 2884 !Arguments ------------------------------------ 2885 class(gwr_t),intent(in) :: gwr 2886 integer,intent(in) :: itau, spin 2887 type(desc_t),intent(out) :: desc_mykbz(gwr%my_nkbz) 2888 type(__slkmat_t),intent(inout) :: gt_gpr(2, gwr%my_nkbz) 2889 2890 !Local variables------------------------------- 2891 !scalars 2892 integer :: my_ikf, ik_bz, ig2, ipm, npwsp, col_bsize, idat, ndat 2893 logical :: k_is_gamma 2894 real(dp) :: kk_bz(3), cpu, wall, gflops, mem_mb 2895 complex(gwpc),allocatable :: ceikr(:) 2896 character(len=500) :: msg 2897 type(__slkmat_t) :: rgp, gt_pm(2) 2898 type(uplan_t) :: uplan_k 2899 2900 ! ************************************************************************* 2901 2902 call cwtime(cpu, wall, gflops, "start") 2903 2904 !mem_mb = two * gwr%my_nkbz * two * gwpc * gwr%g_nfft * gwr%gree_mpw * b2Mb / gwr%g_slkproc%nbprocs 2905 !call wrtout(std_out, sjoin("Local memory for Green's functions: ", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 2906 2907 ABI_MALLOC(ceikr, (gwr%g_nfft * gwr%nspinor)) 2908 2909 do my_ikf=1,gwr%my_nkbz 2910 ik_bz = gwr%my_kbz_inds(my_ikf) 2911 kk_bz = gwr%kbz(:, ik_bz) 2912 k_is_gamma = normv(kk_bz, gwr%cryst%gmet, "G") < GW_TOLQ0 2913 if (.not. k_is_gamma) call calc_ceikr(kk_bz, gwr%g_ngfft, gwr%g_nfft, gwr%nspinor, ceikr) 2914 2915 ! Get G_kbz(+/- itau) in the BZ. 2916 call gwr%rotate_gpm(ik_bz, itau, spin, desc_mykbz(my_ikf), gt_pm) 2917 2918 associate (desc_k => desc_mykbz(my_ikf)) 2919 call uplan_k%init(desc_k%npw, gwr%nspinor, gwr%uc_batch_size, gwr%g_ngfft, desc_k%istwfk, & 2920 desc_k%gvec, gwpc, gwr%dtset%gpu_option) 2921 2922 do ipm=1,2 2923 ! Allocate rgp PBLAS matrix to store G_kbz(r,g') 2924 associate (g_gp => gt_pm(ipm)) 2925 npwsp = desc_k%npw * gwr%nspinor 2926 ABI_CHECK(block_dist_1d(npwsp, gwr%g_comm%nproc, col_bsize, msg), msg) 2927 call rgp%init(gwr%g_nfft * gwr%nspinor, npwsp, gwr%g_slkproc, desc_k%istwfk, size_blocs=[-1, col_bsize]) 2928 !ABI_CHECK_IEQ(size(g_gp%buffer_cplx, dim=2), size(rgp%buffer_cplx, dim=2), "len2") 2929 2930 ! Perform FFT G_k(g,g') -> G_k(r,g') and store results in rgp. 2931 do ig2=1, g_gp%sizeb_local(2), gwr%uc_batch_size 2932 ndat = blocked_loop(ig2, g_gp%sizeb_local(2), gwr%uc_batch_size) 2933 call uplan_k%execute_gr(ndat, g_gp%buffer_cplx(:, ig2), rgp%buffer_cplx(:, ig2)) 2934 2935 if (.not. k_is_gamma) then 2936 ! Multiply by e^{ik.r} 2937 !$OMP PARALLEL DO 2938 do idat=0,ndat-1 2939 rgp%buffer_cplx(:, ig2 + idat) = ceikr(:) * rgp%buffer_cplx(:, ig2 + idat) 2940 end do 2941 end if 2942 end do ! ig2 2943 2944 ! MPI transpose: G_k(r,g') -> G_k(g',r) 2945 call rgp%ptrans("N", gt_gpr(ipm, my_ikf), free=.True.) 2946 end associate 2947 end do ! ipm 2948 2949 call uplan_k%free(); call slk_array_free(gt_pm) 2950 end associate 2951 end do ! my_ikf 2952 2953 mem_mb = sum(slk_array_locmem_mb(gt_gpr)) 2954 call wrtout(std_out, sjoin(" Local memory for G_kbz(g',r,itau): ", ftoa(mem_mb, fmt="f8.1"), "[Mb] <<< MEM")) 2955 2956 ABI_FREE(ceikr) 2957 call cwtime_report(" gwr_get_myk_green_gpr:", cpu, wall, gflops) 2958 2959 end subroutine gwr_get_myk_green_gpr
m_gwr/gwr_get_myq_wc_gpr [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_get_myq_wc_gpr
FUNCTION
Use FFTs to compute: Wc_q(g,g') --> Wc_q(g',r) for each q in the BZ treated by this MPI proc for given `spin` and `itau` index: 1) FFT Transform the first index: Wc(g,g',it) --> Wc(r,g',it) (local operation) 2) MPI transposition: Wc(r,g',it) --> Wc(g',r,it)
INPUTS
OUTPUT
SOURCE
3326 subroutine gwr_get_myq_wc_gpr(gwr, itau, spin, desc_myqbz, wc_gpr) 3327 3328 !Arguments ------------------------------------ 3329 class(gwr_t),intent(inout) :: gwr 3330 integer,intent(in) :: itau, spin 3331 type(desc_t),target,intent(out) :: desc_myqbz(gwr%my_nqbz) 3332 type(__slkmat_t),intent(inout) :: wc_gpr(gwr%my_nqbz) 3333 3334 !Local variables------------------------------- 3335 !scalars 3336 integer :: my_iqf, iq_bz, ig2, npwsp, col_bsize, idat, ndat 3337 real(dp) :: cpu, wall, gflops, mem_mb, qq_bz(3) 3338 logical :: q_is_gamma 3339 character(len=500) :: msg 3340 type(__slkmat_t) :: rgp, wc_qbz 3341 type(uplan_t) :: uplan_q 3342 complex(gwpc),allocatable :: ceiqr(:) 3343 3344 ! ************************************************************************* 3345 3346 call cwtime(cpu, wall, gflops, "start") 3347 ABI_MALLOC(ceiqr, (gwr%g_nfft * gwr%nspinor)) 3348 3349 do my_iqf=1,gwr%my_nqbz 3350 iq_bz = gwr%my_qbz_inds(my_iqf); qq_bz = gwr%qbz(:, iq_bz) 3351 q_is_gamma = normv(qq_bz, gwr%cryst%gmet, "G") < GW_TOLQ0 3352 if (.not. q_is_gamma) call calc_ceikr(qq_bz, gwr%g_ngfft, gwr%g_nfft, gwr%nspinor, ceiqr) 3353 3354 ! Get Wc_q in the BZ. 3355 call gwr%rotate_wc(iq_bz, itau, spin, desc_myqbz(my_iqf), wc_qbz) 3356 associate (desc_q => desc_myqbz(my_iqf)) 3357 3358 ! Allocate rgp PBLAS matrix to store Wc_q(r, g') 3359 npwsp = desc_q%npw * gwr%nspinor 3360 ABI_CHECK(block_dist_1d(npwsp, gwr%g_comm%nproc, col_bsize, msg), msg) 3361 call rgp%init(gwr%g_nfft * gwr%nspinor, npwsp, gwr%g_slkproc, desc_q%istwfk, size_blocs=[-1, col_bsize]) 3362 3363 call uplan_q%init(desc_q%npw, gwr%nspinor, gwr%uc_batch_size, gwr%g_ngfft, desc_q%istwfk, & 3364 desc_q%gvec, gwpc, gwr%dtset%gpu_option) 3365 3366 ! FFT and store results in rgp 3367 do ig2=1,wc_qbz%sizeb_local(2), gwr%uc_batch_size 3368 ndat = blocked_loop(ig2, wc_qbz%sizeb_local(2), gwr%uc_batch_size) 3369 call uplan_q%execute_gr(ndat, wc_qbz%buffer_cplx(:, ig2), rgp%buffer_cplx(:, ig2)) 3370 3371 ! Multiply by e^{iq.r} 3372 if (.not. q_is_gamma) then 3373 !$OMP PARALLEL DO 3374 do idat=0,ndat-1 3375 rgp%buffer_cplx(:, ig2+idat) = ceiqr(:) * rgp%buffer_cplx(:, ig2+idat) 3376 end do 3377 end if 3378 end do ! ig2 3379 3380 call uplan_q%free() 3381 3382 ! MPI transposition: Wc(r,g') -> Wc(g',r) 3383 call rgp%ptrans("N", wc_gpr(my_iqf), free=.True.) 3384 end associate 3385 call wc_qbz%free() 3386 end do ! my_iqf 3387 ABI_FREE(ceiqr) 3388 3389 mem_mb = sum(slk_array_locmem_mb(wc_gpr)) 3390 call wrtout(std_out, sjoin(" Local memory for Wc(g',r):", ftoa(mem_mb, fmt="f8.1"), "[Mb] <<< MEM")) 3391 call cwtime_report(" gwr_get_myq_wc_gpr:", cpu, wall, gflops) 3392 3393 end subroutine gwr_get_myq_wc_gpr
m_gwr/gwr_get_u_ngfft [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_get_u_ngfft
FUNCTION
Compute FFT mesh from boxcutmin.
INPUTS
OUTPUT
SOURCE
7857 subroutine gwr_get_u_ngfft(gwr, boxcutmin, u_ngfft, u_nfft, u_mgfft, u_mpw, gmax) 7858 7859 !Arguments ------------------------------------ 7860 class(gwr_t),intent(in) :: gwr 7861 real(dp),intent(in) :: boxcutmin 7862 integer,intent(out) :: u_ngfft(18), u_nfft, u_mgfft, u_mpw, gmax(3) 7863 7864 !Local variables------------------------------- 7865 integer :: ik_bz, npw_, ig, ii 7866 real(dp) :: kk_bz(3) 7867 integer,allocatable :: gvec_(:,:) 7868 7869 ! ************************************************************************* 7870 7871 ! All MPI procs in gwr%comm execute this part. 7872 ! Note the loops over the full BZ to compute u_mpw 7873 ! FIXME: umklapp, ecutsigx and q-centered G-sphere 7874 ! TODO: Write new routine to compute best FFT mesh for ecut1 + ecut1. See set_mesh from GW code. 7875 7876 u_ngfft = gwr%dtset%ngfft ! This to allow users to specify fftalg 7877 7878 u_mpw = -1; gmax = 0 7879 do ik_bz=1,gwr%nkbz 7880 kk_bz = gwr%kbz(:, ik_bz) 7881 call get_kg(kk_bz, istwfk1, gwr%dtset%ecut, gwr%cryst%gmet, npw_, gvec_) 7882 u_mpw = max(u_mpw, npw_) 7883 ! TODO: g0 umklapp here can enter into play gmax may not be large enough! 7884 do ig=1,npw_ 7885 do ii=1,3 7886 gmax(ii) = max(gmax(ii), abs(gvec_(ii, ig))) 7887 end do 7888 end do 7889 ABI_FREE(gvec_) 7890 call getng(boxcutmin, gwr%dtset%chksymtnons, gwr%dtset%ecut, gwr%cryst%gmet, & 7891 kk_bz, me_fft0, u_mgfft, u_nfft, u_ngfft, nproc_fft1, gwr%cryst%nsym, paral_fft0, & 7892 gwr%cryst%symrel, gwr%cryst%tnons, gpu_option=gwr%dtset%gpu_option, unit=dev_null) 7893 end do 7894 7895 end subroutine gwr_get_u_ngfft
m_gwr/gwr_get_wc_rpr_qbz [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_get_wc_rpr_qbz
FUNCTION
Compute Wc_q(r',r') for q in the BZ
INPUTS
OUTPUT
SOURCE
3411 subroutine gwr_get_wc_rpr_qbz(gwr, g0_q, iq_bz, itau, spin, wc_rpr) 3412 3413 !Arguments ------------------------------------ 3414 class(gwr_t),intent(inout) :: gwr 3415 integer,intent(in) :: g0_q(3) 3416 integer,intent(in) :: iq_bz, itau, spin 3417 type(__slkmat_t),intent(inout) :: wc_rpr 3418 3419 !Local variables------------------------------- 3420 !scalars 3421 integer :: ig2, npwsp, nrsp, col_bsize, ir1, ndat, idat 3422 character(len=500) :: msg 3423 type(desc_t) :: desc_qbz 3424 type(__slkmat_t) :: wc_ggp, rgp, gpr 3425 type(uplan_t) :: uplan_k 3426 complex(gwpc),allocatable :: ceig0r(:) 3427 ! ************************************************************************* 3428 3429 ! NB: Non-zero g0, requires the application of the phase. 3430 if (any(g0_q /= 0)) then 3431 ABI_MALLOC(ceig0r, (gwr%g_nfft * gwr%nspinor)) 3432 call calc_ceigr(-g0_q, gwr%g_nfft, gwr%nspinor, gwr%g_ngfft, ceig0r) 3433 end if 3434 3435 ! Get W_q(g,g') in the BZ. 3436 call gwr%rotate_wc(iq_bz, itau, spin, desc_qbz, wc_ggp) 3437 3438 ! Allocate rgp PBLAS matrix to store Wc(r,g') 3439 nrsp = gwr%g_nfft * gwr%nspinor 3440 npwsp = desc_qbz%npw * gwr%nspinor 3441 ABI_CHECK(block_dist_1d(npwsp, gwr%g_comm%nproc, col_bsize, msg), msg) 3442 call rgp%init(nrsp, npwsp, gwr%g_slkproc, desc_qbz%istwfk, size_blocs=[-1, col_bsize]) 3443 3444 call uplan_k%init(desc_qbz%npw, gwr%nspinor, gwr%uc_batch_size, gwr%g_ngfft, desc_qbz%istwfk, & 3445 desc_qbz%gvec, gwpc, gwr%dtset%gpu_option) 3446 3447 ! FFT Wc(g,g') -> Wc(r,g') and store results in rgp 3448 do ig2=1,wc_ggp%sizeb_local(2), gwr%uc_batch_size 3449 ndat = blocked_loop(ig2, wc_ggp%sizeb_local(2), gwr%uc_batch_size) 3450 call uplan_k%execute_gr(ndat, wc_ggp%buffer_cplx(:,ig2), rgp%buffer_cplx(:,ig2)) 3451 3452 ! Multiply by e^{ig0.r} 3453 if (any(g0_q /= 0)) then 3454 do idat=0,ndat-1 3455 rgp%buffer_cplx(:, ig2+idat) = ceig0r(:) * rgp%buffer_cplx(:, ig2+idat) 3456 end do 3457 end if 3458 end do ! ig2 3459 3460 ! MPI transpose: Wc(r,g') -> Wc(g',r) 3461 call rgp%ptrans("N", gpr, free=.True.) 3462 3463 ! Wc_q(g',r) -> Wc_q(r',r) and store results in wc_rgp. 3464 do ir1=1,gpr%sizeb_local(2), gwr%uc_batch_size 3465 ndat = blocked_loop(ir1, gpr%sizeb_local(2), gwr%uc_batch_size) 3466 call uplan_k%execute_gr(ndat, gpr%buffer_cplx(:, ir1), wc_rpr%buffer_cplx(:, ir1), isign=-1, iscale=0) 3467 3468 ! Multiply by e^{ig0.r} 3469 if (any(g0_q /= 0)) then 3470 do idat=0,ndat-1 3471 wc_rpr%buffer_cplx(:, ir1+idat) = conjg(ceig0r) * wc_rpr%buffer_cplx(:, ir1+idat) 3472 end do 3473 end if 3474 end do ! ir1 3475 3476 call uplan_k%free(); call gpr%free(); call desc_qbz%free(); call wc_ggp%free() 3477 3478 ABI_SFREE(ceig0r) 3479 3480 end subroutine gwr_get_wc_rpr_qbz
m_gwr/gwr_gk_to_scbox [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_gk_to_scbox
FUNCTION
INPUTS
OUTPUT
SOURCE
2522 subroutine gwr_gk_to_scbox(gwr, sc_ngfft, desc_mykbz, green_scgvec, my_ir, ndat, gt_gpr, gt_scbox, gt_scbox_win) 2523 2524 !Arguments ------------------------------------ 2525 class(gwr_t),target,intent(in) :: gwr 2526 integer,intent(in) :: sc_ngfft(18) 2527 integer,intent(out) :: green_scgvec(3, gwr%green_mpw) 2528 type(desc_t),intent(inout) :: desc_mykbz(gwr%my_nkbz) 2529 type(__slkmat_t),intent(in) :: gt_gpr(2, gwr%my_nkbz) 2530 integer,intent(in) :: my_ir, ndat 2531 complex(gwpc),intent(out) :: gt_scbox(product(sc_ngfft(4:6))*gwr%nspinor, gwr%sc_batch_size, 2) 2532 !complex(gwpc),intent(out) :: gt_scbox(:,:,:) 2533 integer,optional,intent(inout) :: gt_scbox_win 2534 2535 !Local variables------------------------------- 2536 integer :: my_ikf, ik_bz, ipm, gg(3), idat, iepoch, ii, idat_list(gwr%kpt_comm%nproc) ! ig, 2537 !real(dp) :: tsec(2) !, cpu, wall, gflops 2538 2539 ! ************************************************************************* 2540 2541 !call cwtime(cpu, wall, gflops, "start") 2542 !call timab(1929, 1, tsec) 2543 2544 ! Take the union of (k,g') for k in the BZ. 2545 ! Note gwr%ngkpt instead of gwr%ngqpt. 2546 if (.not. present(gt_scbox_win)) then 2547 2548 gt_scbox = czero_gw 2549 do my_ikf=1,gwr%my_nkbz 2550 ik_bz = gwr%my_kbz_inds(my_ikf); gg = nint(gwr%kbz(:,ik_bz) * gwr%ngkpt) 2551 #if 1 2552 do ipm=1,2 2553 call desc_mykbz(my_ikf)%to_scbox(gwr%kbz(:,ik_bz), gwr%ngkpt, sc_ngfft, gwr%nspinor*ndat, & 2554 gt_gpr(ipm, my_ikf)%buffer_cplx(1,my_ir), gt_scbox(:,:,ipm)) 2555 end do 2556 #else 2557 associate (desc_k => desc_mykbz(my_ikf)) 2558 do ig=1,desc_k%npw 2559 green_scgvec(:,ig) = gg + gwr%ngkpt * desc_k%gvec(:,ig) ! k+g 2560 end do 2561 do ipm=1,2 2562 call gsph2box(sc_ngfft, desc_k%npw, gwr%nspinor*ndat, green_scgvec, & 2563 gt_gpr(ipm, my_ikf)%buffer_cplx(1,my_ir), gt_scbox(:,:,ipm)) 2564 end do 2565 end associate 2566 #endif 2567 end do ! my_ikf 2568 2569 else 2570 ! Each MPI proc operates on a different idat vector at each epoch 2571 idat_list = cshift([(ii, ii=1,gwr%kpt_comm%nproc)], shift=-gwr%kpt_comm%me) 2572 2573 do iepoch=1,gwr%kpt_comm%nproc 2574 call xmpi_win_fence(gt_scbox_win) 2575 idat = idat_list(iepoch) 2576 if (idat > ndat) goto 10 2577 if (iepoch == 1) then 2578 do ipm=1,2 2579 gt_scbox(:,idat,ipm) = czero_gw 2580 end do 2581 end if 2582 2583 do my_ikf=1,gwr%my_nkbz 2584 ik_bz = gwr%my_kbz_inds(my_ikf); gg = nint(gwr%kbz(:, ik_bz) * gwr%ngkpt) 2585 #if 1 2586 do ipm=1,2 2587 call desc_mykbz(my_ikf)%to_scbox(gwr%kbz(:,ik_bz), gwr%ngkpt, sc_ngfft, gwr%nspinor * ndat1, & 2588 gt_gpr(ipm, my_ikf)%buffer_cplx(1,my_ir+idat-1), gt_scbox(:,idat,ipm)) 2589 end do 2590 #else 2591 associate (desc_k => desc_mykbz(my_ikf)) 2592 do ig=1,desc_k%npw 2593 green_scgvec(:,ig) = gg + gwr%ngkpt * desc_k%gvec(:,ig) ! k+g 2594 end do 2595 do ipm=1,2 2596 call gsph2box(sc_ngfft, desc_k%npw, gwr%nspinor * ndat1, green_scgvec, & 2597 gt_gpr(ipm, my_ikf)%buffer_cplx(1,my_ir+idat-1), gt_scbox(:,idat,ipm)) 2598 end do 2599 end associate 2600 #endif 2601 end do ! my_ikf 2602 10 continue 2603 !call xmpi_barrier(gwr%kpt_comm%value) 2604 !IF (.not. MPI_ASYNC_PROTECTS_NONBLOCKING) CALL MPI_F_SYNC_REG(gt_scbox) 2605 call xmpi_win_fence(gt_scbox_win) 2606 end do ! iepoch 2607 end if 2608 2609 !call cwtime_report(" gwr_gk_to_scbox:", cpu, wall, gflops) 2610 !call timab(1929, 2, tsec) 2611 2612 end subroutine gwr_gk_to_scbox
m_gwr/gwr_init [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_init
FUNCTION
Initialize the gwr object.
INPUTS
OUTPUT
SOURCE
835 subroutine gwr_init(gwr, dtset, dtfil, cryst, psps, pawtab, ks_ebands, mpi_enreg, input_comm) 836 837 !Arguments ------------------------------------ 838 !scalars 839 class(gwr_t),target,intent(out) :: gwr 840 type(dataset_type),target,intent(in) :: dtset 841 type(datafiles_type),target,intent(in) :: dtfil 842 type(crystal_t),target,intent(in) :: cryst 843 type(pseudopotential_type),target,intent(in) :: psps 844 type(pawtab_type),target,intent(in) :: pawtab(psps%ntypat*psps%usepaw) 845 type(ebands_t),target,intent(in) :: ks_ebands 846 type(mpi_type),target,intent(in) :: mpi_enreg 847 integer,intent(in) :: input_comm 848 849 !Local variables------------------------------- 850 !scalars 851 integer,parameter :: qptopt1 = 1, qtimrev1 = 1, master = 0, ndims = 4 852 integer :: my_it, my_ikf, ii, ebands_timrev, my_iki, my_iqi, itau, spin, my_iqf 853 integer :: my_nshiftq, iq_bz, iq_ibz, npw_, ncid, smat_bsize1, smat_bsize2 854 integer :: comm_cart, me_cart, ierr, all_nproc, my_rank, qprange_, gap_err, ncerr, omp_nt 855 integer :: cnt, ikcalc, ndeg, mband, bstop, nbsum, jj 856 integer :: ik_ibz, ik_bz, isym_k, trev_k, g0_k(3) 857 integer :: ip_g, ip_k, ip_t, ip_s, np_g, np_k, np_t, np_s 858 real(dp) :: cpu, wall, gflops, wmax, vc_ecut, delta, abs_rerr, exact_int, eval_int 859 real(dp) :: prev_efficiency, prev_speedup, regterm, prev_dual_error 860 logical :: isirr_k, changed, q_is_gamma, reorder 861 character(len=5000) :: msg 862 type(krank_t) :: qrank, krank_ibz 863 type(est_t) :: est 864 !arrays 865 integer :: qptrlatt(3,3), dims_kgts(ndims), try_dims_kgts(ndims), indkk_k(6,1), units(2) 866 integer,allocatable :: gvec_(:,:),degblock(:,:), degblock_all(:,:,:,:), ndeg_all(:,:), iwork(:,:), got(:) 867 real(dp) :: my_shiftq(3,1), kk_ibz(3), kk_bz(3), qq_bz(3), qq_ibz(3), kk(3), tsec(2) 868 real(dp),allocatable :: wtk(:), kibz(:,:) 869 logical :: periods(ndims), keepdim(ndims) 870 871 ! ************************************************************************* 872 873 call cwtime(cpu, wall, gflops, "start") 874 call timab(1920, 1, tsec) 875 876 all_nproc = xmpi_comm_size(input_comm); my_rank = xmpi_comm_rank(input_comm) 877 units = [std_out, ab_out] 878 879 ! Keep a reference to other objects to simplify the internal API. 880 gwr%dtset => dtset; gwr%dtfil => dtfil; gwr%cryst => cryst; gwr%psps => psps; gwr%pawtab => pawtab 881 gwr%ks_ebands => ks_ebands; gwr%kibz => ks_ebands%kptns; gwr%wtk => ks_ebands%wtk 882 gwr%mpi_enreg => mpi_enreg 883 884 ! Initialize qp_ebands with KS values. 885 call ebands_copy(ks_ebands, gwr%qp_ebands) 886 call ebands_copy(ks_ebands, gwr%qp_ebands_prev) 887 888 ABI_MALLOC(gwr%ks_vbik, (gwr%ks_ebands%nkpt, gwr%ks_ebands%nsppol)) 889 gwr%ks_vbik(:,:) = ebands_get_valence_idx(gwr%ks_ebands) 890 891 gwr%nspinor = dtset%nspinor; gwr%nsppol = dtset%nsppol; gwr%nspden = dtset%nspden; gwr%nsig_ab = gwr%nspinor ** 2 892 gwr%natom = dtset%natom; gwr%usepaw = dtset%usepaw 893 894 gwr%sig_diago = .True. 895 if (string_in(gwr%dtset%gwr_task, "GAMMA_GW")) gwr%sig_diago = .False. 896 897 ! Decide whether one should use supercells or convolutions in the BZ. 898 gwr%use_supercell_for_tchi = .True. 899 if (gwr%dtset%gwr_chi_algo == 0) then 900 ! Automatic selection 901 ABI_ERROR("Not implemented Error") 902 else 903 gwr%use_supercell_for_tchi = gwr%dtset%gwr_chi_algo == 1 904 end if 905 906 if (gwr%dtset%gwr_sigma_algo == 0) then 907 ! Automatic selection 908 ABI_ERROR("Not implemented Error") 909 else 910 gwr%use_supercell_for_sigma = gwr%dtset%gwr_sigma_algo == 1 911 end if 912 913 ! Set q0 914 if (dtset%gw_nqlwl /= 0) gwr%q0 = dtset%gw_qlwl(:, 1) 915 916 mband = ks_ebands%mband; nbsum = dtset%nband(1) 917 ABI_CHECK_IRANGE(nbsum, 1, mband, "Invalid nbsum") 918 919 !call gwr%pstat%from_pid() 920 921 ! Define frequency mesh for sigma(w_real) and spectral functions. 922 ! Note that in GWR computing quantities on the real-axis is really cheap 923 ! so we can use very dense meshes without affecting performance. 924 ! The default for nfresp and freqspmax is zero. 925 ! TODO: Perhaps we can make it optional as in legacy-GW. 926 wmax = dtset%freqspmax; if (abs(wmax) < tol6) wmax = 100 * eV_Ha 927 gwr%nwr = dtset%nfreqsp 928 if (gwr%nwr == 0) gwr%nwr = nint(wmax / (0.05 * eV_Ha)) 929 if (mod(gwr%nwr, 2) == 0) gwr%nwr = gwr%nwr + 1 930 gwr%wr_step = wmax / (gwr%nwr - 1) 931 932 ! ======================= 933 ! Setup k-mesh and q-mesh 934 ! ======================= 935 936 ! Get full kBZ associated to ks_ebands 937 call kpts_ibz_from_kptrlatt(cryst, ks_ebands%kptrlatt, ks_ebands%kptopt, ks_ebands%nshiftk, ks_ebands%shiftk, & 938 gwr%nkibz, kibz, wtk, gwr%nkbz, gwr%kbz) !, bz2ibz=bz2ibz) 939 !new_kptrlatt=gwr%kptrlatt, new_shiftk=gwr%kshift, 940 !bz2ibz=new%ind_qbz2ibz) # FIXME 941 ABI_FREE(wtk) 942 943 ! In principle kibz should be equal to ks_ebands%kptns. 944 ABI_CHECK_IEQ(gwr%nkibz, ks_ebands%nkpt, "nkibz != ks_ebands%nkpt") 945 ABI_CHECK(all(abs(ks_ebands%kptns - kibz) < tol12), "ks_ebands%kibz != kibz") 946 947 if (.not. (isdiagmat(ks_ebands%kptrlatt) .and. ks_ebands%nshiftk == 1)) then 948 ABI_ERROR("GWR code requires ngkpt with one shift!") 949 end if 950 gwr%ngkpt = get_diag(ks_ebands%kptrlatt) 951 952 ! Note symrec convention. 953 ebands_timrev = kpts_timrev_from_kptopt(ks_ebands%kptopt) 954 krank_ibz = krank_from_kptrlatt(gwr%nkibz, kibz, ks_ebands%kptrlatt, compute_invrank=.False.) 955 956 ABI_MALLOC(gwr%kbz2ibz, (6, gwr%nkbz)) 957 if (kpts_map("symrec", ebands_timrev, cryst, krank_ibz, gwr%nkbz, gwr%kbz, gwr%kbz2ibz) /= 0) then 958 ABI_ERROR("Cannot map kBZ to IBZ!") 959 end if 960 961 ! Order kbz by stars and rearrange entries in kbz2ibz table. 962 call kpts_pack_in_stars(gwr%nkbz, gwr%kbz, gwr%kbz2ibz) 963 964 if (my_rank == master) then 965 call kpts_map_print(units, " Mapping kBZ --> kIBZ", "symrec", gwr%kbz, kibz, gwr%kbz2ibz, gwr%dtset%prtvol) 966 end if 967 968 !call get_ibz2bz(gwr%nkibz, gwr%nkbz, gwr%kbz2ibz, kibz2bz, ierr) 969 !ABI_CHECK(ierr == 0, "Something wrong in symmetry tables for k-points") 970 971 ! Table with symrel conventions for the symmetrization of the wfs. 972 ABI_MALLOC(gwr%kbz2ibz_symrel, (6, gwr%nkbz)) 973 if (kpts_map("symrel", ebands_timrev, cryst, krank_ibz, gwr%nkbz, gwr%kbz, gwr%kbz2ibz_symrel) /= 0) then 974 ABI_ERROR("Cannot map kBZ to IBZ!") 975 end if 976 977 ! Setup qIBZ, weights and BZ. 978 ! Always use q --> -q symmetry even in systems without inversion 979 ! TODO: Might add input variable to rescale the q-mesh. 980 my_nshiftq = 1; my_shiftq = zero; qptrlatt = ks_ebands%kptrlatt 981 call kpts_ibz_from_kptrlatt(cryst, qptrlatt, qptopt1, my_nshiftq, my_shiftq, & 982 gwr%nqibz, gwr%qibz, gwr%wtq, gwr%nqbz, gwr%qbz) 983 !new_kptrlatt=gwr%qptrlatt, new_shiftk=gwr%qshift, 984 !bz2ibz=new%ind_qbz2ibz) # FIXME 985 986 ABI_CHECK(all(abs(gwr%qibz(:,1)) < tol16), "First qpoint in qibz should be Gamma!") 987 gwr%ngqpt = get_diag(qptrlatt) 988 989 ! HM: the bz2ibz produced above is incomplete, I do it here using listkk 990 ABI_MALLOC(gwr%qbz2ibz, (6, gwr%nqbz)) 991 992 qrank = krank_from_kptrlatt(gwr%nqibz, gwr%qibz, qptrlatt, compute_invrank=.False.) 993 994 if (kpts_map("symrec", qtimrev1, cryst, qrank, gwr%nqbz, gwr%qbz, gwr%qbz2ibz) /= 0) then 995 ABI_ERROR("Cannot map qBZ to IBZ!") 996 end if 997 call qrank%free() 998 999 ! Order qbz by stars and rearrange entries in qbz2ibz table. 1000 call kpts_pack_in_stars(gwr%nqbz, gwr%qbz, gwr%qbz2ibz) 1001 if (my_rank == master) then 1002 call kpts_map_print(units, " Mapping qBZ --> qIBZ", "symrec", gwr%qbz, gwr%qibz, gwr%qbz2ibz, gwr%dtset%prtvol) 1003 end if 1004 1005 ! ========================== 1006 ! Setup k-points in Sigma_nk 1007 ! ========================== 1008 gwr%ks_gaps = ebands_get_gaps(ks_ebands, gap_err) 1009 if (my_rank == master) then 1010 !call ebands_print(ks_ebands, header="KS band structure", unit=std_out, prtvol=gwr%dtset%prtvol) 1011 !call ebands_print_gaps(ks_ebands, ab_out, header="KS gaps (Fermi energy set to zero)") 1012 msg = "Kohn-Sham gaps and band edges from IBZ mesh" 1013 call gwr%ks_gaps%print(unit=std_out, header=msg) 1014 call gwr%ks_gaps%print(unit=ab_out, header=msg) 1015 end if 1016 1017 ! TODO: nkcalc should be spin dependent. 1018 ! This piece of code is taken from m_sigmaph. 1019 ! In principle one should use the same algorithm in setup_sigma (legacy GW code). 1020 if (dtset%nkptgw /= 0) then 1021 ! Treat the k-points and bands specified in the input file via kptgw and bdgw. 1022 call sigtk_kcalc_from_nkptgw(dtset, mband, gwr%nkcalc, gwr%kcalc, gwr%bstart_ks, gwr%nbcalc_ks) 1023 1024 else 1025 if (any(abs(dtset%sigma_erange) > zero)) then 1026 ! Use sigma_erange and (optionally) sigma_ngkpt 1027 call sigtk_kcalc_from_erange(dtset, cryst, ks_ebands, gwr%ks_gaps, & 1028 gwr%nkcalc, gwr%kcalc, gwr%bstart_ks, gwr%nbcalc_ks, input_comm) 1029 1030 else 1031 ! Use qp_range to select the interesting k-points and the corresponding bands. 1032 ! 1033 ! 0 --> Compute the QP corrections only for the fundamental and the direct gap. 1034 ! +num --> Compute the QP corrections for all the k-points in the irreducible zone and include `num` 1035 ! bands above and below the Fermi level. 1036 ! -num --> Compute the QP corrections for all the k-points in the irreducible zone. 1037 ! Include all occupied states and `num` empty states. 1038 1039 qprange_ = dtset%gw_qprange 1040 if (gap_err /= 0 .and. qprange_ == 0) then 1041 ABI_WARNING("Cannot compute fundamental and direct gap (likely metal). Will replace qprange 0 with qprange 1") 1042 qprange_ = 1 1043 end if 1044 1045 if (qprange_ /= 0) then 1046 call sigtk_kcalc_from_qprange(dtset, cryst, ks_ebands, qprange_, gwr%nkcalc, gwr%kcalc, gwr%bstart_ks, gwr%nbcalc_ks) 1047 else 1048 ! qprange is not specified in the input. 1049 ! Include direct and fundamental KS gap or include states depending on the position wrt band edges. 1050 call sigtk_kcalc_from_gaps(dtset, ks_ebands, gwr%ks_gaps, gwr%nkcalc, gwr%kcalc, gwr%bstart_ks, gwr%nbcalc_ks) 1051 end if 1052 end if 1053 1054 end if ! nkptgw /= 0 1055 1056 ! Include all degenerate states and map kcalc to the IBZ. NB: This part is copied from sigmaph. 1057 1058 ! The k-point and the symmetries connecting the BZ k-point to the IBZ. 1059 ABI_MALLOC(gwr%kcalc2ibz, (gwr%nkcalc, 6)) 1060 1061 ! Workspace arrays used to compute degeneracy tables. 1062 ABI_ICALLOC(degblock_all, (2, mband, gwr%nkcalc, gwr%nsppol)) 1063 ABI_ICALLOC(ndeg_all, (gwr%nkcalc, gwr%nsppol)) 1064 1065 ierr = 0 1066 do ikcalc=1,gwr%nkcalc 1067 ! Note symrel and use_symrel. 1068 ! These are the conventions for the symmetrization of the wavefunctions used in cgtk_rotate. 1069 kk = gwr%kcalc(:, ikcalc) 1070 1071 if (kpts_map("symrel", ebands_timrev, cryst, krank_ibz, 1, kk, indkk_k) /= 0) then 1072 write(msg, '(5a)' ) & 1073 "The WFK file cannot be used to compute self-energy corrections at k-point: ",trim(ktoa(kk)),ch10,& 1074 "The k-point cannot be generated from a symmetrical one.", ch10 1075 ABI_ERROR(msg) 1076 end if 1077 1078 ! TODO: Invert dims and update abipy 1079 gwr%kcalc2ibz(ikcalc, :) = indkk_k(:, 1) 1080 1081 ik_ibz = indkk_k(1,1); isym_k = indkk_k(2,1) 1082 trev_k = indkk_k(6,1); g0_k = indkk_k(3:5,1) 1083 isirr_k = (isym_k == 1 .and. trev_k == 0 .and. all(g0_k == 0)) 1084 !kk_ibz = ks_ebands%kptns(:,ik_ibz) 1085 if (.not. isirr_k) then 1086 ABI_WARNING(sjoin("The k-point in Sigma_{nk} must be in the IBZ but got:", ktoa(kk))) 1087 ierr = ierr + 1 1088 end if 1089 1090 ! We will have to average the QP corrections over degenerate states if symsigma=1 is used. 1091 ! Here we make sure that all the degenerate states are included. 1092 ! Store also band indices of the degenerate sets, used to average final results. 1093 if (abs(gwr%dtset%symsigma) == 1) then 1094 cnt = 0 1095 do spin=1,gwr%nsppol 1096 bstop = gwr%bstart_ks(ikcalc, spin) + gwr%nbcalc_ks(ikcalc, spin) - 1 1097 call ebands_enclose_degbands(ks_ebands, ik_ibz, spin, gwr%bstart_ks(ikcalc, spin), bstop, changed, TOL_EDIFF, & 1098 degblock=degblock) 1099 if (changed) then 1100 gwr%nbcalc_ks(ikcalc, spin) = bstop - gwr%bstart_ks(ikcalc, spin) + 1 1101 cnt = cnt + 1 1102 if (cnt < 5) then 1103 write(msg,'(2(a,i0),2a,2(1x,i0))') & 1104 "Not all the degenerate states for ikcalc: ",ikcalc,", spin: ",spin,ch10, & 1105 "were included in the bdgw set. bdgw has been automatically changed to: ",gwr%bstart_ks(ikcalc, spin), bstop 1106 ABI_COMMENT(msg) 1107 end if 1108 write(msg,'(2(a,i0),2a)') & 1109 "The number of included states: ", bstop, & 1110 " is larger than the number of bands in the input ",dtset%nband(ik_ibz + (spin-1)*ks_ebands%nkpt),ch10,& 1111 "Action: Increase nband." 1112 ABI_CHECK(bstop <= dtset%nband(ik_ibz + (spin-1)*ks_ebands%nkpt), msg) 1113 end if 1114 1115 ! Store band indices used for averaging (shifted by bstart_ks) 1116 ndeg = size(degblock, dim=2) 1117 ndeg_all(ikcalc, spin) = ndeg 1118 degblock_all(:, 1:ndeg, ikcalc, spin) = degblock(:, 1:ndeg) 1119 1120 ABI_FREE(degblock) 1121 end do 1122 end if ! symsigma 1123 end do ! ikcalc 1124 1125 ABI_CHECK(ierr == 0, "kptgw wavevectors must be in the IBZ read from the WFK file.") 1126 1127 ! Build degtab tables to average self-energy matrix element if symsigma /= 0 1128 if (abs(gwr%dtset%symsigma) == 1) then 1129 ABI_MALLOC(gwr%degtab, (gwr%nkcalc, gwr%nsppol)) 1130 do ikcalc=1,gwr%nkcalc 1131 do spin=1,gwr%nsppol 1132 ndeg = ndeg_all(ikcalc, spin) 1133 ABI_MALLOC(gwr%degtab(ikcalc, spin)%bids, (ndeg)) 1134 do ii=1,ndeg 1135 cnt = degblock_all(2, ii, ikcalc, spin) - degblock_all(1, ii, ikcalc, spin) + 1 1136 ABI_MALLOC(gwr%degtab(ikcalc, spin)%bids(ii)%vals, (cnt)) 1137 ! Here we start to count bands from bstart_ks(ikcalc, spin) 1138 !gwr%degtab(ikcalc, spin)%bids(ii)%vals = [(jj, jj= & 1139 ! degblock_all(1, ii, ikcalc, spin) - gwr%bstart_ks(ikcalc, spin) + 1, & 1140 ! degblock_all(2, ii, ikcalc, spin) - gwr%bstart_ks(ikcalc, spin) + 1)] 1141 ! Note that we start to count bands from bstart_ks(ikcalc, spin) 1142 gwr%degtab(ikcalc, spin)%bids(ii)%vals = [(jj, jj= & 1143 degblock_all(1, ii, ikcalc, spin), degblock_all(2, ii, ikcalc, spin))] 1144 end do 1145 end do 1146 end do 1147 end if 1148 1149 ABI_FREE(degblock_all) 1150 ABI_FREE(ndeg_all) 1151 1152 ! Now we can finally compute max_nbcalc 1153 gwr%max_nbcalc = maxval(gwr%nbcalc_ks) 1154 ABI_MALLOC(gwr%bstop_ks, (gwr%nkcalc, gwr%nsppol)) 1155 gwr%bstop_ks = gwr%bstart_ks + gwr%nbcalc_ks - 1 1156 gwr%b1gw = minval(gwr%bstart_ks); gwr%b2gw = maxval(gwr%bstop_ks) 1157 1158 call krank_ibz%free() 1159 ABI_FREE(kibz) ! Deallocate kibz here because krank_ibz keeps a reference to this array. 1160 1161 ! ================================ 1162 ! Setup tau/omega mesh and weights 1163 ! ================================ 1164 ! Compute min/max transition energy taking into account nsppol if any. 1165 gwr%te_min = minval(gwr%ks_gaps%cb_min - gwr%ks_gaps%vb_max) 1166 gwr%te_max = maxval(ks_ebands%eig(nbsum,:,:) - ks_ebands%eig(1,:,:)) 1167 if (gwr%te_min <= tol6) then 1168 gwr%te_min = tol6 1169 ABI_ERROR("System is metallic or with a very small fundamental gap!") 1170 end if 1171 gwr%ntau = dtset%gwr_ntau 1172 1173 !#ifdef __HAVE_GREENX 1174 regterm = dtset%gwr_regterm 1175 if (regterm > -tol16) then 1176 call wrtout(std_out, sjoin("Computing minimax grid with user-provided regterm:", ftoa(regterm))) 1177 call gx_minimax_grid(gwr%ntau, gwr%te_min, gwr%te_max, & ! in 1178 gwr%tau_mesh, gwr%tau_wgs, gwr%iw_mesh, gwr%iw_wgs, & ! out args allocated by the routine. 1179 gwr%cosft_wt, gwr%cosft_tw, gwr%sinft_wt, & 1180 gwr%ft_max_error, gwr%cosft_duality_error, ierr, regterm=regterm) 1181 ABI_CHECK(ierr == 0, "Error in gx_minimax_grid") 1182 else 1183 regterm = zero 1184 call wrtout(std_out, sjoin("Computing minimax grid with user-provided regterm:", ftoa(regterm))) 1185 call gx_minimax_grid(gwr%ntau, gwr%te_min, gwr%te_max, & ! in 1186 gwr%tau_mesh, gwr%tau_wgs, gwr%iw_mesh, gwr%iw_wgs, & ! out args allocated by the routine. 1187 gwr%cosft_wt, gwr%cosft_tw, gwr%sinft_wt, & 1188 gwr%ft_max_error, gwr%cosft_duality_error, ierr, regterm=regterm) 1189 ABI_CHECK(ierr == 0, "Error in gx_minimax_grid") 1190 1191 ! If duality error is big, use regterm = 1e-6 1192 if (gwr%cosft_duality_error > half) then 1193 ABI_SFREE(gwr%tau_mesh) 1194 ABI_SFREE(gwr%tau_wgs) 1195 ABI_SFREE(gwr%iw_mesh) 1196 ABI_SFREE(gwr%iw_wgs) 1197 ABI_SFREE(gwr%cosft_wt) 1198 ABI_SFREE(gwr%cosft_tw) 1199 ABI_SFREE(gwr%sinft_wt) 1200 regterm = tol6 1201 call wrtout(std_out, sjoin("LARGE duality error -> recomputing minimax grid with regterm:", ftoa(regterm))) 1202 prev_dual_error = gwr%cosft_duality_error 1203 call gx_minimax_grid(gwr%ntau, gwr%te_min, gwr%te_max, & ! in 1204 gwr%tau_mesh, gwr%tau_wgs, gwr%iw_mesh, gwr%iw_wgs, & ! out args allocated by the routine. 1205 gwr%cosft_wt, gwr%cosft_tw, gwr%sinft_wt, & 1206 gwr%ft_max_error, gwr%cosft_duality_error, ierr, regterm=regterm) 1207 ABI_CHECK(ierr == 0, "Error in gx_minimax_grid") 1208 1209 if (gwr%cosft_duality_error > prev_dual_error) then 1210 ABI_WARNING("Using regterm didn't decrease the duality error") 1211 end if 1212 end if 1213 end if 1214 1215 ! FIXME: Here we need to rescale the weights because greenx convention is not what we expect! 1216 !gwr%iw_wgs(:) = gwr%iw_wgs(:) / four 1217 1218 if (gwr%comm%me == 0) then 1219 write(std_out, "(3a)")ch10, " Computing F(delta) = \int_0^{\infty} dw / (w^2 + delta^2) = pi/2/delta ", ch10 1220 write(std_out, "(*(a12,2x))")"delta", "numeric", "exact", "abs_rerr (%)" 1221 do ii=1,10 1222 delta = (ii * gwr%te_min) 1223 eval_int = sum(gwr%iw_wgs(:) / (gwr%iw_mesh(:)**2 + delta**2)) 1224 exact_int = pi / (two * delta) 1225 abs_rerr = 100 * abs(eval_int - exact_int) / exact_int 1226 write(std_out, "(*(es12.5,2x))") delta, eval_int, exact_int, abs_rerr 1227 end do 1228 1229 write(std_out, "(3a)")ch10," Computing F(w) = \int_0^{\infty} e^{-w tau} dtau", ch10 1230 write(std_out, "(*(a12,2x))")"w", "numeric", "exact", "abs_rerr (%)" 1231 do itau=1,gwr%ntau 1232 eval_int = sum(gwr%tau_wgs(:) * exp(-gwr%tau_mesh(:) * gwr%iw_mesh(itau))) 1233 exact_int = one / gwr%iw_mesh(itau) 1234 abs_rerr = 100 * abs(eval_int - exact_int) / exact_int 1235 write(std_out, "(*(es12.5,2x))") gwr%iw_mesh(itau), eval_int, exact_int, abs_rerr 1236 end do 1237 write(std_out, "(a)") 1238 end if 1239 1240 ! ========================================= 1241 ! Find FFT mesh and max number of g-vectors 1242 ! ========================================= 1243 ! Note the usage of gwr_boxcutmin and the loops over the full BZ. All the procs execute this part. 1244 gwr%g_ngfft = gwr%dtset%ngfft; gwr%g_ngfft(1:6) = 0 ! Allow user to specify fftalg 1245 1246 gwr%green_mpw = -1 1247 do ik_bz=1,gwr%nkbz 1248 kk_bz = gwr%kbz(:, ik_bz) 1249 call get_kg(kk_bz, istwfk1, dtset%ecut, gwr%cryst%gmet, npw_, gvec_) 1250 ABI_FREE(gvec_) 1251 call getng(dtset%gwr_boxcutmin, dtset%chksymtnons, dtset%ecut, cryst%gmet, & 1252 kk_bz, me_fft0, gwr%g_mgfft, gwr%g_nfft, gwr%g_ngfft, nproc_fft1, cryst%nsym, paral_fft0, & 1253 cryst%symrel, cryst%tnons, gpu_option=gwr%dtset%gpu_option, unit=dev_null) 1254 gwr%green_mpw = max(gwr%green_mpw, npw_) 1255 end do 1256 1257 gwr%tchi_mpw = -1 1258 do iq_bz=1,gwr%nqbz 1259 qq_bz = gwr%qbz(:, iq_bz) 1260 call get_kg(qq_bz, istwfk1, dtset%ecuteps, gwr%cryst%gmet, npw_, gvec_) 1261 ABI_FREE(gvec_) 1262 call getng(dtset%gwr_boxcutmin, dtset%chksymtnons, dtset%ecuteps, cryst%gmet, & 1263 qq_bz, me_fft0, gwr%g_mgfft, gwr%g_nfft, gwr%g_ngfft, nproc_fft1, cryst%nsym, & 1264 paral_fft0, cryst%symrel, cryst%tnons, gpu_option=gwr%dtset%gpu_option, unit=dev_null) 1265 gwr%tchi_mpw = max(gwr%tchi_mpw, npw_) 1266 if (iq_bz == 1) then 1267 ABI_CHECK(all(abs(qq_bz) < tol16), "First qpoint in the qbz should be Gamma!") 1268 end if 1269 end do 1270 1271 ! For the time being no augmentation 1272 gwr%g_ngfft(4:6) = gwr%g_ngfft(1:3) 1273 1274 ! ======================== 1275 ! === MPI DISTRIBUTION === 1276 ! ======================== 1277 ! 1278 ! Here we define the following quantities: 1279 ! - np_k, np_g, np_t, np_s 1280 ! - gwr%comm and gwr%idle_proc 1281 ! 1282 ! NB: Do not use input_comm after this section as idle processors return immediately. 1283 1284 if (any(dtset%gwr_np_kgts /= 0)) then 1285 ! Use grid from input file. 1286 np_k = dtset%gwr_np_kgts(1); np_g = dtset%gwr_np_kgts(2); np_t = dtset%gwr_np_kgts(3); np_s = dtset%gwr_np_kgts(4) 1287 !call xmpi_comm_multiple_of(product(dtset%gwr_np_kgts), input_comm, gwr%idle_proc, gwr%comm) 1288 !if (gwr%idle_proc) return 1289 gwr%comm = xcomm_from_mpi_int(input_comm) 1290 all_nproc = gwr%comm%nproc 1291 1292 else 1293 ! Automatic grid generation. 1294 ! 1295 ! Priorities | MPI Scalability | Memory 1296 ! ================================================================================================== 1297 ! spin (if any) | excellent | scales 1298 ! g/r (PBLAS) | network-intensive ! scales 1299 ! tau | excellent | scales 1300 ! kbz | newtwork-intensive | scales (depends on the BZ -> IBZ mapping) 1301 1302 gwr%comm = xcomm_from_mpi_int(input_comm) 1303 all_nproc = gwr%comm%nproc 1304 !call xmpi_comm_multiple_of(gwr%ntau * gwr%dtset%nsppol, input_comm, gwr%idle_proc, gwr%comm) 1305 !if (gwr%idle_proc) return 1306 !all_nproc = xmpi_comm_size(gwr%comm) 1307 1308 ! Start from a configuration that minimizes memory i.e use all procs for g-parallelism, 1309 ! then check whether it's possible to move some procs to the other levels 1310 ! without spoiling parallel efficiency and/or increasing memory per MPI proc. 1311 ! Only master rank works here for consistency reasons. 1312 if (my_rank == master) then 1313 dims_kgts = [1, all_nproc, 1, 1] 1314 est = estimate(gwr, dims_kgts) 1315 prev_efficiency = est%efficiency; prev_speedup = est%speedup 1316 call wrtout(units, sjoin("- Optimizing MPI grid with mem_per_cpu_mb:", ftoa(mem_per_cpu_mb), "[Mb]"), pre_newlines=1) 1317 call wrtout(units, "- Use `abinit run.abi --mem-per-cpu=4G` to set mem_per_cpu_mb in the submission script") 1318 write(msg, "(a,4(a4,2x),3(a12,2x))") "- ", "np_k", "np_g", "np_t", "np_s", "memb_per_cpu", "efficiency", "speedup" 1319 call wrtout(units, msg) 1320 ip_k = dims_kgts(1); ip_g = dims_kgts(2); ip_t = dims_kgts(3); ip_s = dims_kgts(4) 1321 write(msg, "(a,4(i4,2x),3(es12.5,2x))") "- ", ip_k, ip_g, ip_t, ip_s, est%mem_total, est%efficiency, est%speedup 1322 call wrtout(units, msg) 1323 1324 do ip_s=1,gwr%nsppol 1325 do ip_t=1,gwr%ntau 1326 if (mod(gwr%ntau, ip_t) /= 0) cycle ! ip_t should divide gwr%ntau. 1327 do ip_k=1,gwr%nkbz 1328 if (mod(gwr%nkbz, ip_k) /= 0) cycle ! ip_k is should divide gwr%nkbz. 1329 do ip_g=1,gwr%green_mpw 1330 try_dims_kgts = [ip_k, ip_g, ip_t, ip_s] 1331 if (product(try_dims_kgts) /= all_nproc .or. all(try_dims_kgts == dims_kgts)) cycle 1332 !ABI_CHECK(block_dist_1d(npwsp, ip_g, col_bsize, msg), msg) 1333 est = estimate(gwr, try_dims_kgts) 1334 !if (est%mem_total < mem_per_cpu_mb * 0.8_dp .and. est%efficiency > prev_efficiency) then 1335 if (est%mem_total < mem_per_cpu_mb * 0.8_dp .and. est%speedup > prev_speedup) then 1336 prev_efficiency = est%efficiency; prev_speedup = est%speedup; dims_kgts = try_dims_kgts 1337 end if 1338 write(msg,"(a,4(i4,2x),3(es12.5,2x))")"- ", ip_k, ip_g, ip_t, ip_s, est%mem_total, est%efficiency, est%speedup 1339 call wrtout(units, msg) 1340 end do 1341 end do 1342 end do 1343 end do 1344 end if ! master 1345 1346 call xmpi_bcast(dims_kgts, master, gwr%comm%value, ierr) 1347 np_k = dims_kgts(1); np_g = dims_kgts(2); np_t = dims_kgts(3); np_s = dims_kgts(4) 1348 1349 if (my_rank == master) then 1350 est = estimate(gwr, dims_kgts) 1351 call wrtout(units, "-") 1352 call wrtout(units, "- Selected MPI grid:") 1353 ip_k = dims_kgts(1); ip_g = dims_kgts(2); ip_t = dims_kgts(3); ip_s = dims_kgts(4) 1354 write(msg, "(a,4(a4,2x),3(a12,2x))") "- ", "np_k", "np_g", "np_t", "np_s", "memb_per_cpu", "efficiency", "speedup" 1355 call wrtout(units, msg) 1356 write(msg, "(a,4(i4,2x),3(es12.5,2x))")"- ", ip_k, ip_g, ip_t, ip_s, est%mem_total, est%efficiency, est%speedup 1357 call wrtout(units, msg, newlines=1) 1358 call est%print(units) 1359 !call gwr%ps%print([std_out]) 1360 end if 1361 end if 1362 1363 ! ================================ 1364 ! Build MPI grid and communicators 1365 ! ================================ 1366 dims_kgts = [np_k, np_g, np_t, np_s] 1367 gwr%dtset%gwr_np_kgts = dims_kgts 1368 periods(:) = .False.; reorder = .False. 1369 1370 ! Consistency check. 1371 if (product(dims_kgts) /= all_nproc) then 1372 write(msg, "(a,i0,3a, 5(a,1x,i0))") & 1373 "Cannot create 4D Cartesian grid with total nproc: ", all_nproc, ch10, & 1374 "Idle MPI processes are not supported. The product of the `nproc_*` vars should be equal to nproc while is it:", ch10, & 1375 "k_nproc (", np_k, ") x g_nproc (", np_g, ") x tau_nproc (", np_t,") x spin_nproc (", np_s, ") == ", product(dims_kgts) 1376 ABI_ERROR(msg) 1377 end if 1378 1379 #ifdef HAVE_MPI 1380 block 1381 !integer,parameter :: k=1, g=2, t=3, s=4 ! Bad placement 1382 integer,parameter :: k=4, g=3, t=2, s=1 ! Much better placement 1383 dims_kgts = dims_kgts(4:1:-1) 1384 call MPI_CART_CREATE(gwr%comm%value, ndims, dims_kgts, periods, reorder, comm_cart, ierr) 1385 1386 ! Find the index and coordinates of the current processor 1387 call MPI_COMM_RANK(comm_cart, me_cart, ierr) 1388 call MPI_CART_COORDS(comm_cart, me_cart, ndims, gwr%coords_stgk, ierr) 1389 1390 ! k-point communicator 1391 keepdim = .False.; keepdim(k) = .True.; call gwr%kpt_comm%from_cart_sub(comm_cart, keepdim) 1392 ! g-communicator 1393 keepdim = .False.; keepdim(g) = .True.; call gwr%g_comm%from_cart_sub(comm_cart, keepdim) 1394 ! tau-communicator 1395 keepdim = .False.; keepdim(t) = .True.; call gwr%tau_comm%from_cart_sub(comm_cart, keepdim) 1396 ! spin-communicator 1397 keepdim = .False.; keepdim(s) = .True.; call gwr%spin_comm%from_cart_sub(comm_cart, keepdim) 1398 ! Communicator for the g-tau 2D grid. 1399 keepdim = .False.; keepdim(g) = .True.; keepdim(t) = .True.; call gwr%gtau_comm%from_cart_sub(comm_cart, keepdim) 1400 ! Communicator for the k-g 2D grid. 1401 keepdim = .False.; keepdim(k) = .True.; keepdim(g) = .True.; call gwr%kg_comm%from_cart_sub(comm_cart, keepdim) 1402 ! Communicator for the k-g-tau 3D subgrid. 1403 keepdim = .True.; keepdim(s) = .False.; call gwr%kgt_comm%from_cart_sub(comm_cart, keepdim) 1404 ! Communicator for the k-tau-spin 3D subgrid. 1405 keepdim = .True.; keepdim(g) = .False.; call gwr%kts_comm%from_cart_sub(comm_cart, keepdim) 1406 call xmpi_comm_free(comm_cart) 1407 end block 1408 #endif 1409 1410 !call gwr%kpt_comm%print_names(); call gwr%g_comm%print_names() 1411 1412 ! Define batch sizes for FFT transforms taking into account k-point parallelism, OpenMP threads and GPUs. 1413 omp_nt = xomp_get_num_threads(open_parallel=.True.) 1414 1415 if (gwr%dtset%gwr_ucsc_batch(1) > 0) then 1416 ! Take it from input file (user is always right) 1417 gwr%uc_batch_size = gwr%dtset%gwr_ucsc_batch(1) * omp_nt 1418 else 1419 ! Automatic detection 1420 gwr%uc_batch_size = 1 * omp_nt 1421 if (gwr%dtset%gpu_option /= ABI_GPU_DISABLED) gwr%uc_batch_size = 4 * omp_nt 1422 end if 1423 1424 if (gwr%dtset%gwr_ucsc_batch(2) > 0) then 1425 ! Take it from input file (user is always right) 1426 gwr%sc_batch_size = gwr%dtset%gwr_ucsc_batch(2) * omp_nt 1427 else 1428 ! Automatic detection 1429 gwr%sc_batch_size = 1 * omp_nt 1430 if (gwr%dtset%gpu_option /= ABI_GPU_DISABLED) gwr%sc_batch_size = 4 * omp_nt 1431 end if 1432 1433 ! Make sure all procs agree. 1434 !call xmpi_min_ip(gwr%sc_batch_size, gwr%comm%value, ierr) 1435 !call xmpi_min_ip(gwr%uc_batch_size, gwr%comm%value, ierr) 1436 1437 if (my_rank == master) then 1438 call print_ngfft(gwr%g_ngfft, header="FFT mesh for Green's function", unit=std_out) 1439 !call print_ngfft(gwr%g_ngfft, header="FFT mesh for Green's function", unit=ab_out) 1440 call wrtout(units, sjoin("- FFT uc_batch_size:", itoa(gwr%uc_batch_size))) 1441 call wrtout(units, sjoin("- FFT sc_batch_size:", itoa(gwr%sc_batch_size))) 1442 end if 1443 1444 ! Block-distribute dimensions and allocate redirection table local index --> global index. 1445 call xmpi_split_block(gwr%ntau, gwr%tau_comm%value, gwr%my_ntau, gwr%my_itaus) 1446 ABI_CHECK(gwr%my_ntau > 0, "my_ntau == 0, decrease number of procs for tau level") 1447 1448 ! Store the rank of the MPI proc in tau_comm treating itau index. 1449 ABI_MALLOC(gwr%tau_master, (gwr%ntau)) 1450 gwr%tau_master = -1 1451 do my_it=1,gwr%my_ntau 1452 itau = gwr%my_itaus(my_it); gwr%tau_master(itau) = gwr%tau_comm%me 1453 end do 1454 call xmpi_max_ip(gwr%tau_master, gwr%tau_comm%value, ierr) 1455 ABI_CHECK(all(gwr%tau_master > -1), "tau_master!") 1456 1457 call xmpi_split_block(gwr%nsppol, gwr%spin_comm%value, gwr%my_nspins, gwr%my_spins) 1458 ABI_CHECK(gwr%my_nspins > 0, "my_nspins == 0, decrease number of MPI procs for spin level") 1459 1460 ! Distribute k-points in the full BZ and build redirection tables. 1461 ! Finally, find the number of IBZ k-points treated by this MPI rank. 1462 call xmpi_split_block(gwr%nkbz, gwr%kpt_comm%value, gwr%my_nkbz, gwr%my_kbz_inds) 1463 ABI_CHECK(gwr%my_nkbz > 0, "my_nkbz == 0, decrease number of MPI procs for k-point level") 1464 1465 ! Compute np_kibz 1466 ABI_ICALLOC(gwr%np_kibz, (gwr%nkibz)) 1467 do my_ikf=1,gwr%my_nkbz 1468 ik_bz = gwr%my_kbz_inds(my_ikf); ik_ibz = gwr%kbz2ibz(1, ik_bz) 1469 gwr%np_kibz(ik_ibz) = 1 1470 end do 1471 1472 gwr%my_nkibz = count(gwr%np_kibz > 0) 1473 ABI_MALLOC(gwr%my_kibz_inds, (gwr%my_nkibz)) 1474 ii = 0 1475 do ik_ibz=1,gwr%nkibz 1476 if (gwr%np_kibz(ik_ibz) > 0) then 1477 ii = ii + 1; gwr%my_kibz_inds(ii) = ik_ibz 1478 end if 1479 end do 1480 1481 call xmpi_sum(gwr%np_kibz, gwr%kpt_comm%value, ierr) 1482 1483 ! Build table to distribute iterations over ik_ibz as kIBZ might be replicated across MPI procs. 1484 ABI_ICALLOC(iwork, (gwr%kpt_comm%nproc, gwr%nkibz)) 1485 ABI_ICALLOC(got, (gwr%kpt_comm%nproc)) 1486 do my_iki=1,gwr%my_nkibz 1487 ik_ibz = gwr%my_kibz_inds(my_iki) 1488 iwork(gwr%kpt_comm%me + 1, ik_ibz) = 1 1489 end do 1490 call xmpi_sum(iwork, gwr%kpt_comm%value, ierr) 1491 ABI_MALLOC(gwr%itreat_ikibz, (gwr%nkibz)) 1492 gwr%itreat_ikibz = .False. 1493 do ik_ibz=1,gwr%nkibz 1494 ii = imin_loc(got, mask=iwork(:, ik_ibz) /= 0); got(ii) = got(ii) + 1 1495 if (ii == gwr%kpt_comm%me + 1) gwr%itreat_ikibz(ik_ibz) = .True. 1496 end do 1497 ABI_FREE(got) 1498 ABI_FREE(iwork) 1499 1500 ! Distribute q-points in full BZ, transfer symmetry tables. 1501 ! Finally find the number of my IBZ q-points that should be stored in memory. 1502 call xmpi_split_block(gwr%nqbz, gwr%kpt_comm%value, gwr%my_nqbz, gwr%my_qbz_inds) 1503 1504 ! Compute np_qibz 1505 ABI_ICALLOC(gwr%np_qibz, (gwr%nqibz)) 1506 do my_iqf=1,gwr%my_nqbz 1507 iq_bz = gwr%my_qbz_inds(my_iqf); iq_ibz = gwr%qbz2ibz(1, iq_bz) 1508 gwr%np_qibz(iq_ibz) = 1 1509 end do 1510 1511 gwr%my_nqibz = count(gwr%np_qibz > 0) 1512 ABI_MALLOC(gwr%my_qibz_inds, (gwr%my_nqibz)) 1513 ii = 0 1514 do iq_ibz=1,gwr%nqibz 1515 if (gwr%np_qibz(iq_ibz) > 0) then 1516 ii = ii + 1; gwr%my_qibz_inds(ii) = iq_ibz 1517 end if 1518 end do 1519 1520 call xmpi_sum(gwr%np_qibz, gwr%kpt_comm%value, ierr) 1521 1522 ! Build table to distribute iterations over iq_ibz as qIBZ might be replicated. 1523 ABI_ICALLOC(iwork, (gwr%kpt_comm%nproc, gwr%nqibz)) 1524 ABI_ICALLOC(got, (gwr%kpt_comm%nproc)) 1525 do my_iqi=1,gwr%my_nqibz 1526 iq_ibz = gwr%my_qibz_inds(my_iqi) 1527 iwork(gwr%kpt_comm%me + 1, iq_ibz) = 1 1528 end do 1529 call xmpi_sum(iwork, gwr%kpt_comm%value, ierr) 1530 1531 ABI_MALLOC(gwr%itreat_iqibz, (gwr%nqibz)) 1532 gwr%itreat_iqibz = .False. 1533 do iq_ibz=1,gwr%nqibz 1534 ii = imin_loc(got, mask=iwork(:, iq_ibz) /= 0); got(ii) = got(ii) + 1 1535 if (ii == gwr%kpt_comm%me + 1) gwr%itreat_iqibz(iq_ibz) = .True. 1536 end do 1537 ABI_FREE(got) 1538 ABI_FREE(iwork) 1539 1540 ! TODO: MC technique does not seem to work as expected, even in the legacy code. 1541 vc_ecut = max(dtset%ecutsigx, dtset%ecuteps) 1542 call gwr%vcgen%init(cryst, ks_ebands%kptrlatt, gwr%nkbz, gwr%nqibz, gwr%nqbz, gwr%qbz, & 1543 dtset%rcut, dtset%gw_icutcoul, dtset%vcutgeo, vc_ecut, gwr%comm%value) 1544 1545 ! Now we know the value of g_ngfft. Setup tables for zero-padded FFTs. 1546 ! Build descriptors for Green's functions and tchi and setup tables for zero-padded FFTs. 1547 ABI_MALLOC(gwr%green_desc_kibz, (gwr%nkibz)) 1548 1549 do my_iki=1,gwr%my_nkibz 1550 ik_ibz = gwr%my_kibz_inds(my_iki); kk_ibz = gwr%kibz(:, ik_ibz) 1551 call gwr%green_desc_kibz(ik_ibz)%init(kk_ibz, istwfk1, dtset%ecut, gwr) 1552 end do 1553 1554 ABI_MALLOC(gwr%tchi_desc_qibz, (gwr%nqibz)) 1555 ABI_ICALLOC(gwr%chinpw_qibz, (gwr%nqibz)) 1556 1557 do my_iqi=1,gwr%my_nqibz 1558 iq_ibz = gwr%my_qibz_inds(my_iqi); qq_ibz = gwr%qibz(:, iq_ibz) 1559 ! Note ecuteps instead of ecut. Also, sort the g-vectors by |q+g|^2/2 when q is in the IBZ to facilitate 1560 ! the extrapolation of the RPA energy as a function of ecut_chi 1561 call gwr%tchi_desc_qibz(iq_ibz)%init(qq_ibz, istwfk1, dtset%ecuteps, gwr, kin_sorted=.True.) 1562 1563 ! Compute sqrt(vc(q,G)) 1564 associate (desc_q => gwr%tchi_desc_qibz(iq_ibz)) 1565 if (gwr%itreat_iqibz(iq_ibz)) gwr%chinpw_qibz(iq_ibz) = desc_q%npw 1566 q_is_gamma = (normv(qq_ibz, gwr%cryst%gmet, "G") < GW_TOLQ0) 1567 call desc_q%get_vc_sqrt(qq_ibz, q_is_gamma, gwr, gwr%gtau_comm%value) 1568 end associate 1569 end do 1570 1571 ! Collect npwq on all procs 1572 call xmpi_sum(gwr%chinpw_qibz, gwr%comm%value, ierr) 1573 1574 ! Init 1D PBLAS grid to block-distribute matrices along columns. 1575 call gwr%g_slkproc%init(gwr%g_comm%value, grid_dims=[1, gwr%g_comm%nproc]) 1576 call gwr%gtau_slkproc%init(gwr%gtau_comm%value, grid_dims=[1, gwr%gtau_comm%nproc]) 1577 1578 ! ================================== 1579 ! Allocate arrays of PBLAS matrices 1580 ! ================================== 1581 ABI_MALLOC(gwr%gt_kibz, (2, gwr%nkibz, gwr%ntau, gwr%nsppol)) 1582 ABI_MALLOC(gwr%tchi_qibz, (gwr%nqibz, gwr%ntau, gwr%nsppol)) 1583 ABI_MALLOC(gwr%sigc_kibz, (2, gwr%nkibz, gwr%ntau, gwr%nsppol)) 1584 1585 ! ==================================== 1586 ! Create netcdf file to store results 1587 ! ==================================== 1588 gwr%gwrnc_path = strcat(dtfil%filnam_ds(4), "_GWR.nc") 1589 1590 if (my_rank == master) then 1591 call gwr%print(units) 1592 NCF_CHECK(nctk_open_create(ncid, gwr%gwrnc_path, xmpi_comm_self)) 1593 ! Write structure and ebands 1594 NCF_CHECK(cryst%ncwrite(ncid)) 1595 NCF_CHECK(ebands_ncwrite(ks_ebands, ncid)) 1596 1597 ! Add GWR dimensions. 1598 smat_bsize1 = gwr%b2gw - gwr%b1gw + 1 1599 smat_bsize2 = merge(1, gwr%b2gw - gwr%b1gw + 1, gwr%sig_diago) 1600 ncerr = nctk_def_dims(ncid, [ & 1601 nctkdim_t("nsppol", gwr%nsppol), nctkdim_t("ntau", gwr%ntau), nctkdim_t("nwr", gwr%nwr), & 1602 nctkdim_t("chi_mpw", gwr%tchi_mpw), nctkdim_t("nqibz", gwr%nqibz), nctkdim_t("nqbz", gwr%nqbz), & 1603 nctkdim_t("nkcalc", gwr%nkcalc), nctkdim_t("max_nbcalc", gwr%max_nbcalc), & 1604 nctkdim_t("smat_bsize1", smat_bsize1), nctkdim_t("smat_bsize2", smat_bsize2) & 1605 ], defmode=.True.) 1606 NCF_CHECK(ncerr) 1607 1608 ncerr = nctk_def_iscalars(ncid, [character(len=nctk_slen) :: & 1609 "sig_diago", "b1gw", "b2gw", "symsigma", "symchi", "scf_iteration" & 1610 ]) 1611 NCF_CHECK(ncerr) 1612 1613 ncerr = nctk_def_dpscalars(ncid, [character(len=nctk_slen) :: & 1614 "wr_step", "ecuteps", "ecut", "ecutsigx", "gwr_boxcutmin", & 1615 "min_transition_energy_eV", "max_transition_energy_eV", "eratio", & 1616 "ft_max_err_t2w_cos", "ft_max_err_w2t_cos", "ft_max_err_t2w_sin", "cosft_duality_error", "regterm" & 1617 ]) 1618 NCF_CHECK(ncerr) 1619 1620 ! Define arrays with results. 1621 ncerr = nctk_def_arrays(ncid, [ & 1622 nctkarr_t("gwr_task", "char", "character_string_length"), & 1623 nctkarr_t("tau_mesh", "dp", "ntau"), & 1624 nctkarr_t("tau_wgs", "dp", "ntau"), & 1625 nctkarr_t("iw_mesh", "dp", "ntau"), & 1626 nctkarr_t("iw_wgs", "dp", "ntau"), & 1627 nctkarr_t("cosft_wt", "dp", "ntau, ntau"), & 1628 nctkarr_t("cosft_tw", "dp", "ntau, ntau"), & 1629 nctkarr_t("sinft_wt", "dp", "ntau, ntau"), & 1630 !nctkarr_t("ngqpt", "int", "three"), & 1631 nctkarr_t("bstart_ks", "int", "nkcalc, nsppol"), & 1632 nctkarr_t("bstop_ks", "int", "nkcalc, nsppol"), & 1633 nctkarr_t("kcalc", "dp", "three, nkcalc"), & 1634 nctkarr_t("kcalc2ibz", "int", "nkcalc, six") & 1635 ]) 1636 NCF_CHECK(ncerr) 1637 1638 ! ====================================================== 1639 ! Write data that do not depend on the (kpt, spin) loop. 1640 ! ====================================================== 1641 NCF_CHECK(nctk_set_datamode(ncid)) 1642 ncerr = nctk_write_iscalars(ncid, [character(len=nctk_slen) :: & 1643 "sig_diago", "b1gw", "b2gw", "symsigma", "symchi", "scf_iteration"], & 1644 [merge(1, 0, gwr%sig_diago), gwr%b1gw, gwr%b2gw, gwr%dtset%symsigma, dtset%symchi, gwr%scf_iteration]) 1645 NCF_CHECK(ncerr) 1646 1647 ncerr = nctk_write_dpscalars(ncid, [character(len=nctk_slen) :: & 1648 "wr_step", "ecuteps", "ecut", "ecutsigx", "gwr_boxcutmin", & 1649 "min_transition_energy_eV", "max_transition_energy_eV", "eratio", & 1650 "ft_max_err_t2w_cos", "ft_max_err_w2t_cos", "ft_max_err_t2w_sin", "cosft_duality_error", "regterm"], & 1651 [gwr%wr_step, dtset%ecuteps, dtset%ecut, dtset%ecutsigx, dtset%gwr_boxcutmin, & 1652 gwr%te_min, gwr%te_max, gwr%te_max / gwr%te_min, & 1653 gwr%ft_max_error(1), gwr%ft_max_error(2), gwr%ft_max_error(3), gwr%cosft_duality_error, regterm & 1654 ]) 1655 NCF_CHECK(ncerr) 1656 1657 NCF_CHECK(nf90_put_var(ncid, vid("gwr_task"), trim(dtset%gwr_task))) 1658 NCF_CHECK(nf90_put_var(ncid, vid("tau_mesh"), gwr%tau_mesh)) 1659 NCF_CHECK(nf90_put_var(ncid, vid("tau_wgs"), gwr%tau_wgs)) 1660 NCF_CHECK(nf90_put_var(ncid, vid("iw_mesh"), gwr%iw_mesh)) 1661 NCF_CHECK(nf90_put_var(ncid, vid("iw_wgs"), gwr%iw_wgs)) 1662 NCF_CHECK(nf90_put_var(ncid, vid("cosft_wt"), gwr%cosft_wt)) 1663 NCF_CHECK(nf90_put_var(ncid, vid("cosft_tw"), gwr%cosft_tw)) 1664 NCF_CHECK(nf90_put_var(ncid, vid("sinft_wt"), gwr%sinft_wt)) 1665 NCF_CHECK(nf90_put_var(ncid, vid("bstart_ks"), gwr%bstart_ks)) 1666 NCF_CHECK(nf90_put_var(ncid, vid("bstop_ks"), gwr%bstop_ks)) 1667 NCF_CHECK(nf90_put_var(ncid, vid("kcalc"), gwr%kcalc)) 1668 NCF_CHECK(nf90_put_var(ncid, vid("kcalc2ibz"), gwr%kcalc2ibz)) 1669 NCF_CHECK(nf90_close(ncid)) 1670 end if ! master 1671 1672 call cwtime_report(" gwr_init:", cpu, wall, gflops) 1673 call timab(1920, 2, tsec) 1674 1675 contains 1676 integer function vid(vname) 1677 character(len=*),intent(in) :: vname 1678 vid = nctk_idname(ncid, vname) 1679 end function vid 1680 1681 end subroutine gwr_init
m_gwr/gwr_load_kcalc_wfd [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_load_kcalc_wfd
FUNCTION
Load the KS states to compute Sigma_nk from the WFK file
INPUTS
OUTPUT
SOURCE
2034 subroutine gwr_load_kcalc_wfd(gwr, wfk_path, tmp_kstab) 2035 2036 !Arguments ------------------------------------ 2037 class(gwr_t),intent(inout) :: gwr 2038 character(len=*),intent(in) :: wfk_path 2039 integer,allocatable,intent(out) :: tmp_kstab(:,:,:) 2040 2041 !Local variables------------------------------- 2042 !scalars 2043 integer :: mband, nkibz, nsppol, spin, ik_ibz, ikcalc 2044 real(dp) :: cpu, wall, gflops 2045 !character(len=5000) :: msg 2046 type(ebands_t) :: ks_ebands 2047 type(hdr_type) :: wfk_hdr 2048 !arrays 2049 integer,allocatable :: nband(:,:), wfd_istwfk(:) 2050 logical,allocatable :: bks_mask(:,:,:), keep_ur(:,:,:) 2051 2052 ! ************************************************************************* 2053 2054 call cwtime(cpu, wall, gflops, "start") 2055 2056 associate (wfd => gwr%kcalc_wfd, dtset => gwr%dtset) 2057 2058 ks_ebands = wfk_read_ebands(wfk_path, gwr%comm%value, out_hdr=wfk_hdr) 2059 call wfk_hdr%vs_dtset(dtset) 2060 2061 ! TODO: Add more consistency checks e.g. nkibz,... 2062 !cryst = wfk_hdr%get_crystal() 2063 !call cryst%print(header="crystal structure from WFK file") 2064 2065 nkibz = ks_ebands%nkpt; nsppol = ks_ebands%nsppol 2066 2067 ! Don't take mband from ks_ebands but compute it from gwr%bstop_ks 2068 mband = maxval(gwr%bstop_ks) !; mband = ks_ebands%mband 2069 2070 ! Initialize the wave function descriptor. 2071 ! Only wavefunctions for the symmetrical imagine of the k wavevectors 2072 ! treated by this MPI rank are stored. 2073 ABI_MALLOC(nband, (nkibz, nsppol)) 2074 ABI_MALLOC(bks_mask, (mband, nkibz, nsppol)) 2075 ABI_MALLOC(keep_ur, (mband, nkibz, nsppol)) 2076 nband = mband; bks_mask = .False.; keep_ur = .False. 2077 2078 ABI_ICALLOC(tmp_kstab, (2, nkibz, nsppol)) 2079 2080 do spin=1,gwr%nsppol 2081 do ikcalc=1,gwr%nkcalc ! TODO: Should be spin dependent! 2082 ik_ibz = gwr%kcalc2ibz(ikcalc, 1) 2083 associate (b1 => gwr%bstart_ks(ikcalc, spin), b2 => gwr%bstop_ks(ikcalc, spin)) 2084 tmp_kstab(:, ik_ibz, spin) = [b1, b2] 2085 bks_mask(b1:b2, ik_ibz, spin) = .True. 2086 end associate 2087 end do 2088 end do 2089 2090 ! Impose istwfk = 1 for all k-points. 2091 ! wfd_read_wfk will handle a possible conversion if the WFK contains istwfk /= 1. 2092 ABI_MALLOC(wfd_istwfk, (nkibz)) 2093 wfd_istwfk = 1 2094 2095 call wfd_init(wfd, gwr%cryst, gwr%pawtab, gwr%psps, keep_ur, mband, nband, nkibz, dtset%nsppol, bks_mask, & 2096 dtset%nspden, dtset%nspinor, dtset%ecut, dtset%ecutsm, dtset%dilatmx, wfd_istwfk, ks_ebands%kptns, gwr%g_ngfft, & 2097 dtset%nloalg, dtset%prtvol, dtset%pawprtvol, gwr%comm%value) 2098 2099 call wfd%print(header="Wavefunctions for GWR calculation") 2100 2101 ABI_FREE(nband) 2102 ABI_FREE(keep_ur) 2103 ABI_FREE(wfd_istwfk) 2104 ABI_FREE(bks_mask) 2105 2106 call ebands_free(ks_ebands) 2107 call wfk_hdr%free() 2108 2109 ! Read KS wavefunctions. 2110 call wfd%read_wfk(wfk_path, iomode_from_fname(wfk_path)) 2111 end associate 2112 2113 call cwtime_report(" gwr_load_kcalc_from_wfk:", cpu, wall, gflops) 2114 !call gwr%pstat%print([std_out], reload=.True.) 2115 2116 end subroutine gwr_load_kcalc_wfd
m_gwr/gwr_malloc_free_mats [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_malloc_free_mats
FUNCTION
Allocate/Free PBLAS matrices according to `what` for the set of k/q-points selected by `mask_ibz`.
SOURCE
1793 subroutine gwr_malloc_free_mats(gwr, mask_ibz, what, action) 1794 1795 !Arguments ------------------------------------ 1796 class(gwr_t), target, intent(inout) :: gwr 1797 integer,intent(in) :: mask_ibz(:) 1798 character(len=*),intent(in) :: what, action 1799 1800 !Local variables------------------------------- 1801 integer :: my_is, my_it, ipm, npwsp, col_bsize, itau, spin, ik_ibz, iq_ibz 1802 !integer :: ii, num_pm, ipm_list__(2) 1803 type(__slkmat_t), pointer :: mat 1804 character(len=500) :: msg 1805 1806 ! ************************************************************************* 1807 1808 ABI_CHECK(string_in(action, "malloc, free"), sjoin("Invalid action:", action)) 1809 1810 !num_pm = 2; ipm_list__ = [1, 2] 1811 !if (present(ipm_list)) then 1812 ! num_pm = size(ipm_list) 1813 ! ABI_CHECK_IRANGE(num_pm, 1, 2, "num_pm not in [1, 2]") 1814 ! ipm_list__(1:num_pm) = ipm_list(:) 1815 !end if 1816 1817 do my_is=1,gwr%my_nspins 1818 spin = gwr%my_spins(my_is) 1819 do my_it=1,gwr%my_ntau 1820 itau = gwr%my_itaus(my_it) 1821 ! All the PBLAS matrices are MPI distributed over g' in blocks 1822 1823 select case (what) 1824 case ("green") 1825 ! ======================== 1826 ! Allocate/free G_k(g,g') 1827 ! ======================== 1828 ABI_CHECK_IEQ(size(mask_ibz), gwr%nkibz, "wrong mask size") 1829 1830 do ik_ibz=1,gwr%nkibz 1831 if (mask_ibz(ik_ibz) == 0) cycle 1832 npwsp = gwr%green_desc_kibz(ik_ibz)%npw * gwr%nspinor 1833 ABI_CHECK(block_dist_1d(npwsp, gwr%g_comm%nproc, col_bsize, msg), msg) 1834 associate (gt => gwr%gt_kibz(:, ik_ibz, itau, spin)) 1835 do ipm=1,2 1836 if (action == "malloc") call gt(ipm)%init(npwsp, npwsp, gwr%g_slkproc, istwfk1, size_blocs=[-1, col_bsize]) 1837 if (action == "free") call gt(ipm)%free() 1838 end do 1839 end associate 1840 end do 1841 1842 case ("tchi", "wc") 1843 ! =========================== 1844 ! Allocate/free tchi_q(g,g') 1845 ! =========================== 1846 ABI_CHECK_IEQ(size(mask_ibz), gwr%nqibz, "wrong mask size") 1847 1848 do iq_ibz=1,gwr%nqibz 1849 if (mask_ibz(iq_ibz) == 0) cycle 1850 npwsp = gwr%tchi_desc_qibz(iq_ibz)%npw * gwr%nspinor 1851 ABI_CHECK(block_dist_1d(npwsp, gwr%g_comm%nproc, col_bsize, msg), msg) 1852 if (what == "tchi") mat => gwr%tchi_qibz(iq_ibz, itau, spin) 1853 if (what == "wc") mat => gwr%wc_qibz(iq_ibz, itau, spin) 1854 if (action == "malloc") call mat%init(npwsp, npwsp, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 1855 if (action == "free") call mat%free() 1856 end do 1857 1858 case ("sigma") 1859 ! ================================ 1860 ! Allocate/free sigmac_kibz(g,g') 1861 ! ================================ 1862 ABI_CHECK_IEQ(size(mask_ibz), gwr%nkibz, "wrong mask size") 1863 do ik_ibz=1,gwr%nkibz 1864 if (mask_ibz(ik_ibz) == 0) cycle 1865 npwsp = gwr%tchi_desc_qibz(iq_ibz)%npw * gwr%nspinor 1866 ABI_CHECK(block_dist_1d(npwsp, gwr%g_comm%nproc, col_bsize, msg), msg) 1867 associate (sigc => gwr%sigc_kibz(:, ik_ibz, itau, spin)) 1868 do ipm=1,2 1869 if (action == "malloc") call sigc(ipm)%init(npwsp, npwsp, gwr%g_slkproc, 1, size_blocs=[-1, col_bsize]) 1870 if (action == "free") call sigc(ipm)%free() 1871 end do 1872 end associate 1873 end do 1874 1875 case default 1876 ABI_ERROR(sjoin("Invalid what:", what)) 1877 end select 1878 1879 end do ! my_it 1880 end do ! my_is 1881 1882 call wrtout(std_out, "") 1883 call gwr%print_mem(unit=std_out) 1884 1885 end subroutine gwr_malloc_free_mats
m_gwr/gwr_ncwrite_tchi_wc [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_ncwrite_tchi_wc
FUNCTION
Write tchi or wc to netcdf file
INPUTS
OUTPUT
SOURCE
6582 subroutine gwr_ncwrite_tchi_wc(gwr, what, filepath) 6583 6584 !Arguments ------------------------------------ 6585 class(gwr_t),target,intent(in) :: gwr 6586 character(len=*),intent(in) :: what, filepath 6587 6588 !Local variables------------------------------- 6589 !scalars 6590 integer,parameter :: master = 0 6591 integer :: my_is, my_iqi, my_it, spin, iq_ibz, itau, npwtot_q, my_ncols, my_gcol_start, ncid, ncerr !, ierr 6592 real(dp) :: cpu, wall, gflops 6593 !arrays 6594 real(dp), ABI_CONTIGUOUS pointer :: fptr(:,:,:) 6595 type(__slkmat_t), pointer :: mats(:) 6596 6597 ! ************************************************************************* 6598 6599 ! Cannot reuse SCR.nc/SUSC.nc fileformat as: 6600 ! - hscr_new requires ep% 6601 ! - old file formats assume Gamma-centered g vectors. 6602 6603 call cwtime(cpu, wall, gflops, "start") 6604 6605 if (gwr%comm%me == master) then 6606 call wrtout(std_out, sjoin(" Writing", what, "to:", filepath)) 6607 NCF_CHECK(nctk_open_create(ncid, filepath, xmpi_comm_self)) 6608 NCF_CHECK(gwr%cryst%ncwrite(ncid)) 6609 6610 ! Add dimensions. 6611 ncerr = nctk_def_dims(ncid, [ & 6612 nctkdim_t("nsppol", gwr%nsppol), nctkdim_t("ntau", gwr%ntau), nctkdim_t("mpw", gwr%tchi_mpw), & 6613 nctkdim_t("nqibz", gwr%nqibz), nctkdim_t("nqbz", gwr%nqbz)], & 6614 defmode=.True.) 6615 NCF_CHECK(ncerr) 6616 6617 ! Define arrays with results. 6618 ! TODO: Add metadata for mats: spin sum, vc cutoff, t/w mesh, handle nspinor 2 6619 ncerr = nctk_def_arrays(ncid, [ & 6620 nctkarr_t("ngkpt", "int", "three"), & 6621 nctkarr_t("ngqpt", "int", "three"), & 6622 nctkarr_t("qibz", "dp", "three, nqibz"), & 6623 nctkarr_t("wtq", "dp", "nqibz"), & 6624 nctkarr_t("tau_mesh", "dp", "ntau"), & 6625 nctkarr_t("tau_wgs", "dp", "ntau"), & 6626 nctkarr_t("iw_mesh", "dp", "ntau"), & 6627 nctkarr_t("iw_wgs", "dp", "ntau"), & 6628 nctkarr_t("gvecs", "int", "three, mpw, nqibz"), & 6629 nctkarr_t("chinpw_qibz", "int", "nqibz"), & 6630 nctkarr_t("mats", "dp", "two, mpw, mpw, ntau, nqibz, nsppol") & 6631 ]) 6632 NCF_CHECK(ncerr) 6633 6634 ! Write global arrays. 6635 NCF_CHECK(nctk_set_datamode(ncid)) 6636 NCF_CHECK(nf90_put_var(ncid, vid("ngkpt"), gwr%ngkpt)) 6637 NCF_CHECK(nf90_put_var(ncid, vid("ngqpt"), gwr%ngqpt)) 6638 NCF_CHECK(nf90_put_var(ncid, vid("qibz"), gwr%qibz)) 6639 NCF_CHECK(nf90_put_var(ncid, vid("wtq"), gwr%wtq)) 6640 NCF_CHECK(nf90_put_var(ncid, vid("tau_mesh"), gwr%tau_mesh)) 6641 NCF_CHECK(nf90_put_var(ncid, vid("tau_wgs"), gwr%tau_wgs)) 6642 NCF_CHECK(nf90_put_var(ncid, vid("iw_mesh"), gwr%iw_mesh)) 6643 NCF_CHECK(nf90_put_var(ncid, vid("iw_wgs"), gwr%iw_wgs)) 6644 NCF_CHECK(nf90_put_var(ncid, vid("chinpw_qibz"), gwr%chinpw_qibz)) 6645 NCF_CHECK(nf90_close(ncid)) 6646 end if 6647 6648 call xmpi_barrier(gwr%comm%value) 6649 6650 ! Reopen the file in gwr%comm. 6651 NCF_CHECK(nctk_open_modify(ncid, filepath, gwr%comm%value)) 6652 6653 do my_is=1,gwr%my_nspins 6654 spin = gwr%my_spins(my_is) 6655 do my_iqi=1,gwr%my_nqibz 6656 iq_ibz = gwr%my_qibz_inds(my_iqi) 6657 6658 ! The same q-point in the IBZ might be stored on different pools. 6659 ! To avoid writing the same array multiple times, we use itreat_qibz 6660 ! to select the procs inside gwr%kpt_comm who are gonna write this iq_ibz q-point. 6661 if (.not. gwr%itreat_iqibz(iq_ibz)) cycle 6662 6663 associate (desc_q => gwr%tchi_desc_qibz(iq_ibz)) 6664 npwtot_q = desc_q%npw 6665 6666 if (spin == 1 .and. gwr%gtau_comm%me == 0) then 6667 ! Write all G-vectors for this q 6668 NCF_CHECK(nf90_put_var(ncid, vid("gvecs"), desc_q%gvec, start=[1,1,iq_ibz], count=[3,npwtot_q,1])) 6669 end if 6670 6671 mats => null() 6672 if (what == "tchi") mats => gwr%tchi_qibz(iq_ibz, :, spin) 6673 if (what == "wc") mats => gwr%wc_qibz(iq_ibz, :, spin) 6674 ABI_CHECK(associated(mats), sjoin("Invalid value for what:", what)) 6675 6676 do my_it=1,gwr%my_ntau 6677 itau = gwr%my_itaus(my_it) 6678 6679 ! FIXME: Assuming PBLAS matrix distributed in contiguous blocks along the column index. 6680 ! This part must be changed if we use round robin distribution. 6681 my_ncols = mats(itau)%sizeb_local(2) 6682 my_gcol_start = mats(itau)%loc2gcol(1) 6683 6684 ! FIXME: This is wrong if spc 6685 !call c_f_pointer(c_loc(mats(itau)%buffer_cplx), fptr, shape=[2, npwtot_q, my_ncols]) 6686 ABI_MALLOC(fptr, (2, npwtot_q, my_ncols)) 6687 fptr(1,:,:) = dble(mats(itau)%buffer_cplx) 6688 fptr(2,:,:) = aimag(mats(itau)%buffer_cplx) 6689 6690 ncerr = nf90_put_var(ncid, vid("mats"), fptr, & 6691 start=[1, 1, my_gcol_start, itau, iq_ibz, spin], & 6692 count=[2, npwtot_q, my_ncols, 1, 1, 1]) 6693 !stride=[1, gwr%g_comm%nproc, 1, 1, 1]) 6694 ABI_FREE(fptr) 6695 NCF_CHECK(ncerr) 6696 end do 6697 end associate 6698 end do ! my_iqi 6699 end do ! my_is 6700 6701 NCF_CHECK(nf90_close(ncid)) 6702 call cwtime_report(" gwr_ncwrite_tchi_wc:", cpu, wall, gflops) 6703 6704 contains 6705 integer function vid(vname) 6706 character(len=*),intent(in) :: vname 6707 vid = nctk_idname(ncid, vname) 6708 end function vid 6709 6710 end subroutine gwr_ncwrite_tchi_wc
m_gwr/gwr_print [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_print
FUNCTION
Print info on the gwr object.
INPUTS
SOURCE
3913 subroutine gwr_print(gwr, units, header) 3914 3915 !Arguments ------------------------------------ 3916 class(gwr_t),intent(in) :: gwr 3917 integer,intent(in) :: units(:) 3918 character(len=*),optional,intent(in) :: header 3919 3920 !Local variables------------------------------- 3921 integer :: ii 3922 character(len=500) :: msg 3923 type(yamldoc_t) :: ydoc 3924 3925 ! ********************************************************************* 3926 3927 msg = ' ==== Info on the gwr_t object ==== '; if (present(header)) msg=' ==== '//trim(adjustl(header))//' ==== ' 3928 call wrtout(units, msg) 3929 3930 ydoc = yamldoc_open('GWR_params') !, width=11, real_fmt='(3f8.3)') 3931 call ydoc%add_string("gwr_task", gwr%dtset%gwr_task) 3932 call ydoc%add_int("nband", gwr%dtset%nband(1)) 3933 call ydoc%add_int("ntau", gwr%ntau) 3934 call ydoc%add_int1d("ngkpt", gwr%ngkpt) 3935 call ydoc%add_int1d("ngqpt", gwr%ngqpt) 3936 msg = "supercell"; if (.not. gwr%use_supercell_for_tchi) msg = "BZ-convolutions" 3937 call ydoc%add_string("chi_algo", msg) 3938 msg = "supercell"; if (.not. gwr%use_supercell_for_sigma) msg = "BZ-convolutions" 3939 call ydoc%add_string("sigma_algo", msg) 3940 call ydoc%add_int("nkibz", gwr%nkibz) 3941 call ydoc%add_int("nqibz", gwr%nqibz) 3942 call ydoc%add_int("inclvkb", gwr%dtset%inclvkb) 3943 call ydoc%add_real1d("q0", gwr%q0) ! "for long-wavelenght limit")) 3944 call ydoc%add_int("gw_icutcoul", gwr%dtset%gw_icutcoul) 3945 call ydoc%add_int("green_mpw", gwr%green_mpw) 3946 call ydoc%add_int("tchi_mpw", gwr%tchi_mpw) 3947 call ydoc%add_int1d("g_ngfft", gwr%g_ngfft(1:6)) 3948 call ydoc%add_real("gwr_boxcutmin", gwr%dtset%gwr_boxcutmin) 3949 call ydoc%add_int1d("P gwr_np_kgts", gwr%dtset%gwr_np_kgts) 3950 call ydoc%add_int1d("P np_kibz", gwr%np_kibz) 3951 call ydoc%add_int1d("P np_qibz", gwr%np_qibz) 3952 ! Print Max error due to the inhomogeneous FT. 3953 call ydoc%add_real("min_transition_energy_eV", gwr%te_min) 3954 call ydoc%add_real("max_transition_energy_eV", gwr%te_max) 3955 call ydoc%add_real("eratio", gwr%te_max / gwr%te_min) 3956 call ydoc%add_real("ft_max_err_t2w_cos", gwr%ft_max_error(1)) 3957 call ydoc%add_real("ft_max_err_w2t_cos", gwr%ft_max_error(2)) 3958 call ydoc%add_real("ft_max_err_t2w_sin", gwr%ft_max_error(3)) 3959 call ydoc%add_real("cosft_duality_error", gwr%cosft_duality_error) 3960 ! Print imaginary time/frequency mesh with weights. 3961 call ydoc%open_tabular("Minimax imaginary tau/omega mesh", comment="tau, weight(tau), omega, weight(omega)") 3962 do ii=1,gwr%ntau 3963 write(msg, "(i0, 4(es12.5,2x))")ii, gwr%tau_mesh(ii), gwr%tau_wgs(ii), gwr%iw_mesh(ii), gwr%iw_wgs(ii) 3964 call ydoc%add_tabular_line(msg) 3965 end do 3966 3967 call ydoc%write_units_and_free(units) 3968 3969 end subroutine gwr_print
m_gwr/gwr_print_mem [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_print_mem
FUNCTION
Print memory allocated for matrices.
SOURCE
3981 subroutine gwr_print_mem(gwr, unit) 3982 3983 !Arguments ------------------------------------ 3984 class(gwr_t),intent(in) :: gwr 3985 integer,optional,intent(in) :: unit 3986 3987 !Local variables------------------------------- 3988 !scalars 3989 integer :: unt 3990 real(dp) :: mem_mb 3991 !character(len=500) :: msg 3992 3993 ! ********************************************************************* 3994 3995 unt = std_out; if (present(unit)) unt =unit 3996 3997 if (allocated(gwr%gt_kibz)) then 3998 mem_mb = sum(slk_array_locmem_mb(gwr%gt_kibz)) 3999 if (mem_mb > zero) then 4000 call wrtout(std_out, sjoin("- Local memory for G(g,g',kibz,itau): ", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 4001 end if 4002 end if 4003 if (allocated(gwr%tchi_qibz)) then 4004 mem_mb = sum(slk_array_locmem_mb(gwr%tchi_qibz)) 4005 if (mem_mb > zero) then 4006 call wrtout(std_out, sjoin("- Local memory for Chi(g,g',qibz,itau): ", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 4007 end if 4008 end if 4009 if (allocated(gwr%wc_qibz)) then 4010 mem_mb = sum(slk_array_locmem_mb(gwr%wc_qibz)) 4011 if (mem_mb > zero) then 4012 call wrtout(std_out, sjoin("- Local memory for Wc(g,g,qibz,itau): ", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 4013 end if 4014 end if 4015 if (allocated(gwr%sigc_kibz)) then 4016 mem_mb = sum(slk_array_locmem_mb(gwr%sigc_kibz)) 4017 if (mem_mb > zero) then 4018 call wrtout(std_out, sjoin("- Local memory for Sigma_c(g,g',kibz,itau): ", ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 4019 end if 4020 end if 4021 if (allocated(gwr%ugb)) then 4022 mem_mb = sum(slk_array_locmem_mb(gwr%ugb)) 4023 if (mem_mb > zero) then 4024 call wrtout(std_out, sjoin('- Local memory for u_gb wavefunctions: ', ftoa(mem_mb, fmt="f8.1"), ' [Mb] <<< MEM')) 4025 end if 4026 end if 4027 call wrtout(std_out, " ") 4028 4029 end subroutine gwr_print_mem
m_gwr/gwr_print_trace [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_print_trace
FUNCTION
Print traces of PBLAS matrices to std_out and ab_out. NB: This is a global routine that should be called by all procs inside gwr%comm.
INPUTS
OUTPUT
SOURCE
4780 subroutine gwr_print_trace(gwr, what) 4781 4782 !Arguments ------------------------------------ 4783 class(gwr_t),target,intent(inout) :: gwr 4784 character(len=*),intent(in) :: what 4785 4786 !Local variables------------------------------- 4787 integer,parameter :: master = 0 4788 integer :: my_is, spin, my_it, itau, iq_ibz, ierr, my_iqi, my_iki, ik_ibz, ipm 4789 character(len=5000) :: comment 4790 integer :: units(2) 4791 complex(dp),allocatable :: ctrace3(:,:,:), ctrace4(:,:,:,:) 4792 type(__slkmat_t),contiguous, pointer :: mats(:,:,:) 4793 4794 ! ************************************************************************* 4795 4796 ! NB: The same q/k point in the IBZ might be available on different procs in kpt_comm 4797 ! hence we have to rescale the trace before summing the results in gwr%comm. 4798 comment = "Invalid space!"; units = [std_out, ab_out] 4799 4800 select case (what) 4801 case ("tchi_qibz", "wc_qibz") 4802 ! Trace of tchi or Wc 4803 ABI_CALLOC(ctrace3, (gwr%nqibz, gwr%ntau, gwr%nsppol)) 4804 4805 if (what == "tchi_qibz") then 4806 mats => gwr%tchi_qibz 4807 if (gwr%tchi_space == "iomega") comment = " (iq_ibz, iomega) table" 4808 if (gwr%tchi_space == "itau") comment = " (iq_ibz, itau) table" 4809 else if (what == "wc_qibz") then 4810 mats => gwr%wc_qibz 4811 if (gwr%wc_space == "iomega") comment = " (iq_ibz, iomega) table" 4812 if (gwr%wc_space == "itau") comment = " (iq_ibz, itau) table" 4813 end if 4814 4815 do my_is=1,gwr%my_nspins 4816 spin = gwr%my_spins(my_is) 4817 do my_it=1,gwr%my_ntau 4818 itau = gwr%my_itaus(my_it) 4819 do my_iqi=1,gwr%my_nqibz 4820 iq_ibz = gwr%my_qibz_inds(my_iqi) 4821 ctrace3(iq_ibz, itau, spin) = mats(iq_ibz, itau, spin)%get_trace() / gwr%np_qibz(iq_ibz) 4822 end do 4823 end do 4824 end do 4825 4826 call xmpi_sum_master(ctrace3, 0, gwr%kts_comm%value, ierr) 4827 4828 if (gwr%comm%me == master) then 4829 do spin=1,gwr%nsppol 4830 call wrtout(units, sjoin(" Trace of:", what, "for spin:", itoa(spin), "for testing purposes:")) 4831 call wrtout(units, comment, pre_newlines=2) 4832 call print_arr(ctrace3(:,:,spin), unit=ab_out) 4833 call print_arr(ctrace3(:,:,spin), unit=std_out) 4834 end do 4835 end if 4836 ABI_FREE(ctrace3) 4837 4838 case ("gt_kibz") 4839 ! Trace of Green's functions. 4840 ABI_CALLOC(ctrace4, (gwr%nkibz, gwr%ntau, 2, gwr%nsppol)) 4841 4842 do my_is=1,gwr%my_nspins 4843 spin = gwr%my_spins(my_is) 4844 do my_it=1,gwr%my_ntau 4845 itau = gwr%my_itaus(my_it) 4846 do my_iki=1,gwr%my_nkibz 4847 ik_ibz = gwr%my_kibz_inds(my_iki) 4848 do ipm=1,2 4849 ctrace4(ik_ibz, itau, ipm, spin) = gwr%gt_kibz(ipm, ik_ibz, itau, spin)%get_trace() / gwr%np_kibz(ik_ibz) 4850 end do 4851 end do 4852 end do 4853 end do 4854 comment = " (ik_ibz, itau) table" 4855 4856 call xmpi_sum_master(ctrace4, master, gwr%kts_comm%value, ierr) 4857 4858 if (gwr%comm%me == master) then 4859 do spin=1,gwr%nsppol 4860 do ipm=1,2 4861 call wrtout(units, sjoin(" Trace of:", what, "for ipm:", itoa(ipm), ", spin:", itoa(spin), "for testing purposes:")) 4862 call wrtout(units, comment, newlines=1) 4863 call print_arr(ctrace4(:,:, ipm, spin), unit=ab_out) 4864 call print_arr(ctrace4(:,:, ipm, spin), unit=std_out) 4865 end do 4866 end do 4867 end if 4868 ABI_FREE(ctrace4) 4869 4870 case default 4871 ABI_ERROR(sjoin("Invalid value of what:", what)) 4872 end select 4873 4874 end subroutine gwr_print_trace
m_gwr/gwr_read_ugb_from_wfk [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_read_ugb_from_wfk
FUNCTION
Read wavefunctions from the WFK file wfk_path and store them in gwr%ugb (MPI distributed).
SOURCE
2130 subroutine gwr_read_ugb_from_wfk(gwr, wfk_path) 2131 2132 !Arguments ------------------------------------ 2133 class(gwr_t),target,intent(inout) :: gwr 2134 character(len=*),intent(in) :: wfk_path 2135 2136 !Local variables------------------------------- 2137 !scalars 2138 integer,parameter :: formeig0 = 0, master = 0 2139 integer :: mband, min_nband, nkibz, nsppol, my_is, my_iki, spin, ik_ibz, ierr, io_algo, bcast_comm, color 2140 integer :: npw_k, mpw, istwf_k, il_b, ib, band, iloc !, itau 2141 integer :: nbsum, npwsp, bstart, bstop, band_step, nb !my_nband ! nband_k, 2142 logical :: print_time 2143 real(dp) :: cpu, wall, gflops, cpu_green, wall_green, gflops_green 2144 character(len=5000) :: msg 2145 logical :: have_band, need_block_ks, io_in_kcomm 2146 type(ebands_t) :: wfk_ebands 2147 type(wfk_t) :: wfk 2148 type(dataset_type),pointer :: dtset 2149 type(xcomm_t), pointer :: io_comm 2150 !arrays 2151 integer,allocatable :: kg_k(:,:) 2152 logical,allocatable :: bmask(:) 2153 real(dp) :: kk_ibz(3), tsec(2) 2154 real(dp),target,allocatable :: cg_work(:,:,:) 2155 real(dp),ABI_CONTIGUOUS pointer :: cg_k(:,:) 2156 2157 ! ************************************************************************* 2158 2159 call cwtime(cpu, wall, gflops, "start") 2160 call timab(1921, 1, tsec) 2161 2162 dtset => gwr%dtset 2163 wfk_ebands = wfk_read_ebands(wfk_path, gwr%comm%value, out_hdr=gwr%wfk_hdr) 2164 call gwr%wfk_hdr%vs_dtset(dtset) 2165 2166 ! TODO: Add more consistency checks e.g. nkibz,... 2167 !cryst = gwr%wfk_hdr%get_crystal() 2168 !call cryst%print(header="crystal structure from WFK file") 2169 2170 nkibz = wfk_ebands%nkpt; nsppol = wfk_ebands%nsppol; mband = wfk_ebands%mband 2171 min_nband = minval(wfk_ebands%nband) 2172 2173 nbsum = dtset%nband(1) 2174 if (nbsum > min_nband) then 2175 ABI_WARNING(sjoin("WFK file contains", itoa(min_nband), "states while you're asking for:", itoa(nbsum))) 2176 nbsum = min_nband 2177 end if 2178 call ebands_free(wfk_ebands) 2179 2180 ! ============================================== 2181 ! Build Green's functions in g-space for given k 2182 ! ============================================== 2183 2184 ! for tau > 0: 2185 ! 2186 ! G_k(r,r',itau) = i \sum_b^{occ} psi_b(r) \psi_b^*(r') exp(e_b tau) 2187 ! 2188 ! for tau < 0: 2189 ! 2190 ! G_k(r,r',itau) = -i \sum_b^{empty} psi_b(r) \psi_b^*(r') exp(e_b tau) 2191 ! 2192 ! NB: G_k is constructed for k in the IBZ, then we rotate the k-point to obtain G_k in the BZ. 2193 ! 2194 ! TODO: 2195 ! 1) Make sure that gvec in gwr and wfd agree with each other. 2196 ! 2) May implement trick used in gwst to add empty states approximated with LC of PWs. 2197 2198 ! Select occupied or empty G. 2199 ! if (eig_nk < -tol6) then 2200 ! !ipm = 1 2201 ! !gt_cfact = j_dpc * exp(gwr%tau_mesh(itau) * eig_nk) 2202 ! ! Vasp convention 2203 ! ipm = 2 2204 ! gt_cfact = exp(gwr%tau_mesh(itau) * eig_nk) 2205 ! else if (eig_nk > tol6) then 2206 ! !ipm = 2 2207 ! !gt_cfact = -j_dpc * exp(-gwr%tau_mesh(itau) * eig_nk) 2208 ! ! Vasp convention 2209 ! ipm = 1 2210 ! gt_cfact = -exp(-gwr%tau_mesh(itau) * eig_nk) 2211 ! else 2212 ! ABI_WARNING("Metallic system of semiconductor with Fermi level inside bands!!!!") 2213 ! end if 2214 2215 call wrtout(std_out, sjoin(" Reading KS states with nbsum:", itoa(nbsum), "..."), do_flush=.True.) 2216 2217 ! Init set of (npwsp, nbsum) PBLAS matrix distributed within the g_comm communicator. 2218 ! and distribute it over bands so that each proc reads a subset of bands in read_band_block 2219 ! Note size_blocs below that corresponds to a round-robin distribution along the band axis. 2220 2221 ABI_MALLOC(gwr%ugb, (gwr%nkibz, gwr%nsppol)) 2222 gwr%ugb_nband = nbsum 2223 2224 do my_is=1,gwr%my_nspins 2225 spin = gwr%my_spins(my_is) 2226 do my_iki=1,gwr%my_nkibz 2227 ik_ibz = gwr%my_kibz_inds(my_iki) 2228 npw_k = gwr%green_desc_kibz(ik_ibz)%npw; npwsp = npw_k * gwr%nspinor 2229 call gwr%ugb(ik_ibz, spin)%init(npwsp, gwr%ugb_nband, gwr%g_slkproc, istwfk1, size_blocs=[-1, 1]) 2230 end do 2231 end do 2232 call gwr%print_mem(unit=std_out) 2233 2234 mpw = maxval(gwr%wfk_hdr%npwarr) 2235 ABI_MALLOC(kg_k, (3, mpw)) 2236 ABI_MALLOC(bmask, (mband)) 2237 2238 io_algo = 2 2239 2240 if (io_algo == 1) then 2241 ! This version is very bad on LUMI 2242 call wrtout(std_out, " Using collective MPI-IO with wfk%read_bmask ...") 2243 call wfk_open_read(wfk, wfk_path, formeig0, iomode_from_fname(wfk_path), get_unit(), gwr%gtau_comm%value) 2244 !call wfk_open_read(wfk, wfk_path, formeig0, iomode_from_fname(wfk_path), get_unit(), gwr%gt_comm%value) 2245 2246 do my_is=1,gwr%my_nspins 2247 spin = gwr%my_spins(my_is) 2248 do my_iki=1,gwr%my_nkibz 2249 print_time = gwr%comm%me == 0 .and. (my_iki <= LOG_MODK .or. mod(my_iki, LOG_MODK) == 0) 2250 if (print_time) call cwtime(cpu_green, wall_green, gflops_green, "start") 2251 ik_ibz = gwr%my_kibz_inds(my_iki); kk_ibz = gwr%kibz(:, ik_ibz) 2252 npw_k = gwr%wfk_hdr%npwarr(ik_ibz); istwf_k = gwr%wfk_hdr%istwfk(ik_ibz) 2253 npwsp = npw_k * gwr%nspinor 2254 ! TODO 2255 ABI_CHECK_IEQ(istwf_k, 1, "istwfk_k should be 1") 2256 2257 associate (ugb => gwr%ugb(ik_ibz, spin), desc_k => gwr%green_desc_kibz(ik_ibz)) 2258 ABI_CHECK_IEQ(npw_k, desc_k%npw, "npw_k != desc_k%npw") 2259 2260 ! use round-robin distribution inside gtau_comm% for IO. 2261 ! TODO: Optimize wfk_read_bmask and/or read WFK with all procs and master broadcasting. 2262 bmask = .False. 2263 do il_b=1, ugb%sizeb_local(2) 2264 band = ugb%loc2gcol(il_b); bmask(band) = .True. 2265 end do 2266 ! FIXME: This is wrong if spc 2267 call c_f_pointer(c_loc(ugb%buffer_cplx), cg_k, shape=[2, npwsp * ugb%sizeb_local(2)]) 2268 call wfk%read_bmask(bmask, ik_ibz, spin, & 2269 !xmpio_single, & 2270 xmpio_collective, & 2271 kg_k=kg_k, cg_k=cg_k) 2272 2273 ABI_CHECK(all(kg_k(:,1:npw_k) == desc_k%gvec), "kg_k != desc_k%gvec") 2274 2275 if (print_time) then 2276 write(msg,'(4x,3(a,i0),a)')"Read ugb_k: my_iki [", my_iki, "/", gwr%my_nkibz, "] (tot: ", gwr%nkibz, ")" 2277 call cwtime_report(msg, cpu_green, wall_green, gflops_green); if (my_iki == LOG_MODK) call wrtout(std_out, " ...") 2278 end if 2279 end associate 2280 end do ! my_iki 2281 end do ! my_is 2282 2283 call wfk%close() 2284 2285 else 2286 ! Master reads and broadcasts. Much faster on lumi 2287 call wrtout(std_out, " Using IO version based on master reads and brodcasts ...") 2288 io_comm => gwr%comm; io_in_kcomm = .False. 2289 io_comm => gwr%kpt_comm; io_in_kcomm = .True. 2290 2291 if (io_comm%me == master) then 2292 call wfk_open_read(wfk, wfk_path, formeig0, iomode_from_fname(wfk_path), get_unit(), xmpi_comm_self) 2293 end if 2294 2295 ! TODO This to be able to maximize the size of cg_work 2296 !call gwr%pstat%mpi_max(vmrss_mb, gwr%comm%value) 2297 2298 do spin=1,gwr%nsppol 2299 if (io_in_kcomm .and. .not. any(gwr%my_spins == spin)) cycle 2300 2301 do ik_ibz=1,gwr%nkibz 2302 print_time = gwr%comm%me == 0 .and. (ik_ibz < LOG_MODK .or. mod(ik_ibz, LOG_MODK) == 0) 2303 if (print_time) call cwtime(cpu_green, wall_green, gflops_green, "start") 2304 kk_ibz = gwr%kibz(:, ik_ibz) 2305 npw_k = gwr%wfk_hdr%npwarr(ik_ibz); istwf_k = gwr%wfk_hdr%istwfk(ik_ibz); npwsp = npw_k * gwr%nspinor 2306 ABI_CHECK_IEQ(istwf_k, 1, "istwfk_k should be 1") 2307 2308 ! Create communicator with master and all procs requiring this (k,s) block (color == 1) 2309 need_block_ks = any(gwr%my_spins == spin) .and. any(gwr%my_kibz_inds == ik_ibz) 2310 color = merge(1, 0, (need_block_ks .or. io_comm%me == master)) 2311 call xmpi_comm_split(io_comm%value, color, io_comm%me, bcast_comm, ierr) 2312 2313 ! TODO: Optimize this part 2314 ! Find band_step that gives good compromise between memory and efficiency. 2315 band_step = memb_limited_step(1, nbsum, 2*npwsp, xmpi_bsize_dp, 1024.0_dp) 2316 band_step = 200 2317 !band_step = 100 2318 do bstart=1, nbsum, band_step 2319 bstop = min(bstart + band_step - 1, nbsum); nb = bstop - bstart + 1 2320 2321 ABI_MALLOC(cg_work, (2, npwsp, nb)) ! This array is always dp 2322 if (io_comm%me == master) then 2323 call c_f_pointer(c_loc(cg_work), cg_k, shape=[2, npwsp * nb]) 2324 call wfk%read_band_block([bstart, bstop], ik_ibz, spin, xmpio_single, kg_k=kg_k, cg_k=cg_k) 2325 end if 2326 2327 if (color == 1) then 2328 call xmpi_bcast(kg_k, master, bcast_comm, ierr) 2329 call xmpi_bcast(cg_work, master, bcast_comm, ierr) 2330 endif 2331 2332 ! Copy my portion of cg_work to buffer_cplx (here we have dp --> sp conversion). 2333 if (need_block_ks) then 2334 associate (ugb => gwr%ugb(ik_ibz, spin), desc_k => gwr%green_desc_kibz(ik_ibz)) 2335 ABI_CHECK(all(kg_k(:,1:npw_k) == desc_k%gvec), "kg_k != desc_k%gvec") 2336 do band=bstart, bstop 2337 ib = band - bstart + 1 2338 call ugb%glob2loc(1, band, iloc, il_b, have_band); if (.not. have_band) cycle 2339 ugb%buffer_cplx(:, il_b) = cmplx(cg_work(1,:,ib), cg_work(2,:,ib), kind=gwpc) 2340 end do 2341 end associate 2342 end if 2343 2344 ABI_FREE(cg_work) 2345 end do ! bstart 2346 2347 call xmpi_comm_free(bcast_comm) 2348 2349 if (print_time) then 2350 write(msg,'(4x,2(a,i0),a)')"Read ugb_k: ik_ibz [", ik_ibz, "/", gwr%nkibz, "]" 2351 call cwtime_report(msg, cpu_green, wall_green, gflops_green); if (ik_ibz == LOG_MODK) call wrtout(std_out, " ...") 2352 end if 2353 end do ! ik_ibz 2354 end do ! spin 2355 if (io_comm%me == master) call wfk%close() 2356 end if ! io_algo 2357 2358 ABI_FREE(kg_k) 2359 ABI_FREE(bmask) 2360 call gwr%print_mem(unit=std_out) 2361 2362 call cwtime_report(" gwr_read_ugb_from_wfk:", cpu, wall, gflops) 2363 call timab(1921, 2, tsec) 2364 2365 end subroutine gwr_read_ugb_from_wfk
m_gwr/gwr_redistrib_gt_kibz [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_redistrib_gt_kibz
FUNCTION
If action == "communicate": Redistribute G_k for fixed (itau, spin) according to `need_kibz` table. Also, set got_kibz to 1 for each IBZ k-point that has been received. If action == "free": Use input `got_kibz` array to deallocate matrices.
INPUTS
OUTPUT
SOURCE
4564 subroutine gwr_redistrib_gt_kibz(gwr, itau, spin, need_kibz, got_kibz, action) 4565 4566 !Arguments ------------------------------------ 4567 class(gwr_t),target,intent(inout) :: gwr 4568 integer,intent(in) :: itau, spin, need_kibz(gwr%nkibz) 4569 integer,intent(inout) :: got_kibz(gwr%nkibz) 4570 character(len=*),intent(in) :: action 4571 !integer,optional,intent(in) :: ipm_list(:) 4572 4573 !Local variables------------------------------- 4574 integer :: ik_ibz, ipm, ierr, do_mpi_kibz(gwr%nkibz), sender_kibz(gwr%nkibz) 4575 integer :: bcast_comm, sender_in_bcast_comm, color 4576 logical :: im_sender 4577 !integer :: num_pm, ipm_list__(2) 4578 real(dp) :: kk_ibz(3), cpu, wall, gflops 4579 complex(gwpc),contiguous, pointer :: ck_ptr(:,:) 4580 4581 ! ************************************************************************* 4582 4583 call cwtime(cpu, wall, gflops, "start") 4584 !num_pm = 2; ipm_list__ = [1, 2] 4585 !if (present(ipm_list)) then 4586 ! num_pm = size(ipm_list) 4587 ! ABI_CHECK_IRANGE(num_pm, 1, 2, "num_pm not in [1, 2]") 4588 ! ipm_list__(1:num_pm) = ipm_list(:) 4589 !end if 4590 4591 select case (action) 4592 case ("communicate") 4593 do_mpi_kibz = need_kibz 4594 do ik_ibz=1,gwr%nkibz 4595 if (allocated(gwr%green_desc_kibz(ik_ibz)%gvec)) do_mpi_kibz(ik_ibz) = 0 4596 end do 4597 call xmpi_sum(do_mpi_kibz, gwr%kpt_comm%value, ierr) 4598 !do_mpi_kibz = 1 4599 4600 ! All procs enter the loop. Sender_kibz stores the rank of the sender in gwr%kpt_comm 4601 got_kibz = 0; sender_kibz(:) = huge(1) 4602 do ik_ibz=1,gwr%nkibz 4603 if (do_mpi_kibz(ik_ibz) == 0) cycle 4604 kk_ibz = gwr%kibz(:, ik_ibz) 4605 if (allocated(gwr%green_desc_kibz(ik_ibz)%gvec)) sender_kibz(ik_ibz) = gwr%kpt_comm%me 4606 if (need_kibz(ik_ibz) /= 0 .and. .not. allocated(gwr%green_desc_kibz(ik_ibz)%gvec)) then 4607 ! NB: Use same args as those used to init the descriptors in gwr_init 4608 ! so that gvec ordering is consistent across MPI procs. 4609 got_kibz(ik_ibz) = 1 4610 call gwr%green_desc_kibz(ik_ibz)%init(kk_ibz, istwfk1, gwr%dtset%ecut, gwr) 4611 end if 4612 end do 4613 4614 ! Define the sender for each kibz in do_mpi_kibz 4615 call xmpi_min_ip(sender_kibz, gwr%kpt_comm%value, ierr) 4616 4617 ! Allocate memory 4618 call gwr%malloc_free_mats(got_kibz, "green", "malloc") 4619 4620 ! MPI communication 4621 do ik_ibz=1,gwr%nkibz 4622 if (do_mpi_kibz(ik_ibz) == 0) cycle 4623 4624 ! Create subcommunicators with color and bcast only inside subcomm. 4625 im_sender = gwr%kpt_comm%me == sender_kibz(ik_ibz) 4626 color = merge(1, 0, im_sender .or. need_kibz(ik_ibz) /= 0) 4627 call xmpi_comm_split(gwr%kpt_comm%value, color, gwr%kpt_comm%me, bcast_comm, ierr) 4628 4629 if (color == 1) then 4630 sender_in_bcast_comm = xmpi_comm_translate_rank(gwr%kpt_comm%value, sender_kibz(ik_ibz), bcast_comm) 4631 do ipm=1,2 4632 ck_ptr => gwr%gt_kibz(ipm, ik_ibz, itau, spin)%buffer_cplx 4633 call xmpi_bcast(ck_ptr, sender_in_bcast_comm, bcast_comm, ierr) 4634 end do 4635 end if 4636 call xmpi_comm_free(bcast_comm) 4637 end do 4638 4639 case ("free") 4640 ! Use got_kibz to free previously allocated memory. 4641 do ik_ibz=1,gwr%nkibz 4642 if (got_kibz(ik_ibz) == 1) call gwr%green_desc_kibz(ik_ibz)%free() 4643 end do 4644 call gwr%malloc_free_mats(got_kibz, "green", "free") 4645 4646 case default 4647 ABI_ERROR(sjoin("Invalid action:", action)) 4648 end select 4649 4650 if (action == "communicate") call cwtime_report(" gwr_redistrib_gt_kibz:", cpu, wall, gflops) 4651 4652 end subroutine gwr_redistrib_gt_kibz
m_gwr/gwr_redistrib_mats_qibz [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_redistrib_mats_qibz
FUNCTION
If action == "communicate": Redistribute chi_q (wc_q) for fixed (itau, spin) according to `need_qibz` table. Also, set `got_qibz` to 1 for each IBZ q-point that has been received. If action == "free": Use `got_qibz` to deallocate matrices received in a previous call with "communicate".
INPUTS
OUTPUT
SOURCE
4672 subroutine gwr_redistrib_mats_qibz(gwr, what, itau, spin, need_qibz, got_qibz, action) 4673 4674 !Arguments ------------------------------------ 4675 class(gwr_t),target,intent(inout) :: gwr 4676 character(len=*),intent(in) :: what 4677 integer,intent(in) :: itau, spin, need_qibz(gwr%nqibz) 4678 integer,intent(inout) :: got_qibz(gwr%nqibz) 4679 character(len=*),intent(in) :: action 4680 4681 !Local variables------------------------------- 4682 integer :: iq_ibz, ierr, bcast_comm, color, do_mpi_qibz(gwr%nqibz), sender_qibz(gwr%nqibz), sender_in_bcast_comm 4683 logical :: im_sender 4684 logical, parameter :: timeit = .False. 4685 real(dp) :: qq_ibz(3), cpu, wall, gflops 4686 complex(gwpc),contiguous, pointer :: cq_ptr(:,:) 4687 4688 ! ************************************************************************* 4689 4690 ABI_CHECK(what == "tchi" .or. what == "wc", sjoin("Invalid what:", what)) 4691 if (timeit) call cwtime(cpu, wall, gflops, "start") 4692 4693 select case (action) 4694 case ("communicate") 4695 do_mpi_qibz = need_qibz 4696 do iq_ibz=1,gwr%nqibz 4697 select case (what) 4698 case ("tchi") 4699 if (allocated(gwr%tchi_qibz(iq_ibz, itau, spin)%buffer_cplx)) do_mpi_qibz(iq_ibz) = 0 4700 case ("wc") 4701 if (allocated(gwr%wc_qibz(iq_ibz, itau, spin)%buffer_cplx)) do_mpi_qibz(iq_ibz) = 0 4702 case default 4703 ABI_ERROR(sjoin("Invalid what:", what)) 4704 end select 4705 end do 4706 4707 call xmpi_sum(do_mpi_qibz, gwr%kpt_comm%value, ierr) 4708 !do_mpi_qibz = 1 4709 4710 ! All procs enter the loop. Sender_qibz stores the rank of the sender in gwr%kpt_comm 4711 got_qibz = 0; sender_qibz(:) = huge(1) 4712 do iq_ibz=1,gwr%nqibz 4713 if (do_mpi_qibz(iq_ibz) == 0) cycle 4714 qq_ibz = gwr%qibz(:, iq_ibz) 4715 if (allocated(gwr%tchi_desc_qibz(iq_ibz)%gvec)) sender_qibz(iq_ibz) = gwr%kpt_comm%me 4716 if (need_qibz(iq_ibz) /= 0 .and. .not. allocated(gwr%tchi_desc_qibz(iq_ibz)%gvec)) then 4717 ! NB: Use same args as those used to init the descriptors in gwr_init 4718 ! so that gvec ordering is consistent across MPI procs. 4719 got_qibz(iq_ibz) = 1 4720 call gwr%tchi_desc_qibz(iq_ibz)%init(qq_ibz, istwfk1, gwr%dtset%ecuteps, gwr, kin_sorted=.True.) 4721 end if 4722 end do 4723 4724 ! Define the sender for each qibz in do_mpi_qibz 4725 call xmpi_min_ip(sender_qibz, gwr%kpt_comm%value, ierr) 4726 4727 ! Allocate memory 4728 call gwr%malloc_free_mats(got_qibz, what, "malloc") 4729 4730 ! MPI communication 4731 do iq_ibz=1,gwr%nqibz 4732 if (do_mpi_qibz(iq_ibz) == 0) cycle 4733 4734 ! Create subcommunicators with color and bcast only inside subcomm. 4735 im_sender = gwr%kpt_comm%me == sender_qibz(iq_ibz) 4736 color = merge(1, 0, im_sender .or. need_qibz(iq_ibz) /= 0) 4737 call xmpi_comm_split(gwr%kpt_comm%value, color, gwr%kpt_comm%me, bcast_comm, ierr) 4738 4739 if (color == 1) then 4740 if (what == "tchi") cq_ptr => gwr%tchi_qibz(iq_ibz, itau, spin)%buffer_cplx 4741 if (what == "wc") cq_ptr => gwr%wc_qibz(iq_ibz, itau, spin)%buffer_cplx 4742 sender_in_bcast_comm = xmpi_comm_translate_rank(gwr%kpt_comm%value, sender_qibz(iq_ibz), bcast_comm) 4743 call xmpi_bcast(cq_ptr, sender_in_bcast_comm, bcast_comm, ierr) 4744 end if 4745 call xmpi_comm_free(bcast_comm) 4746 end do ! iq_ibz 4747 4748 case ("free") 4749 ! Use got_qibz table to free previously allocated memory 4750 do iq_ibz=1,gwr%nqibz 4751 if (got_qibz(iq_ibz) /= 0) call gwr%tchi_desc_qibz(iq_ibz)%free() 4752 end do 4753 call gwr%malloc_free_mats(got_qibz, what, "free") 4754 4755 case default 4756 ABI_ERROR(sjoin("Invalid action:", action)) 4757 end select 4758 4759 if (timeit) call cwtime_report(" gwr_redistrib_mats_qibz:", cpu, wall, gflops) 4760 4761 end subroutine gwr_redistrib_mats_qibz
m_gwr/gwr_rotate_gpm [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_rotate_gpm
FUNCTION
Reconstruct the Green's functions in the BZ from the IBZ.
INPUTS
ik_bz = Index of the k-point in the BZ itau = tau index (global index) spin = spin index (global index) [ipm_list]=Optional list of ipm indices to be considered, e.g. ipm_list=[2] to compute the -tau component.
OUTPUT
desc_kbz = Descriptor in the BZ gt_pm(2) = Gk(+/-tau)
NOTES
* Remember the symmetry properties of \tilde\espilon^{-1} If q_bz = S q_ibz + G0: $\epsilon^{-1}_{SG1-G0, SG2-G0}(q_bz) = e^{+iS(G2-G1).\tau} \epsilon^{-1}_{G1, G2)}(q) If time-reversal symmetry can be used then: $\epsilon^{-1}_{G1,G2}(-q_bz) = e^{+i(G1-G2).\tau} \epsilon^{-1}_{-S^{-1}(G1+Go), -S^{-1}(G2+G0)}^*(q) In the present implementation we are not considering a possible umklapp vector G0 in the expression Sq = q+G0. Treating this case would require some changes in the G-sphere since we have to consider G - G0. The code however stops in sigma if a nonzero G0 is required to reconstruct the BZ.
SOURCE
2749 subroutine gwr_rotate_gpm(gwr, ik_bz, itau, spin, desc_kbz, gt_pm, ipm_list) 2750 2751 !Arguments ------------------------------------ 2752 class(gwr_t),intent(in) :: gwr 2753 integer,intent(in) :: ik_bz, spin, itau 2754 type(desc_t),intent(out) :: desc_kbz 2755 type(__slkmat_t),intent(out) :: gt_pm(2) 2756 integer,optional,intent(in) :: ipm_list(:) 2757 2758 !Local variables------------------------------- 2759 !scalars 2760 integer :: ig1, ig2, il_g1, il_g2, ipm, ik_ibz, isym_k, trev_k, g0_k(3), tsign_k, ii, num_pm, ipm_list__(2) 2761 logical :: isirr_k 2762 !arrays 2763 integer :: g1(3), g2(3) 2764 real(dp) :: tnon(3) !, cpu, wall, gflops 2765 complex(dp) :: ph2, ph1 2766 2767 ! ************************************************************************* 2768 2769 !call cwtime(cpu, wall, gflops, "start") 2770 num_pm = 2; ipm_list__ = [1, 2] 2771 if (present(ipm_list)) then 2772 num_pm = size(ipm_list) 2773 ABI_CHECK_IRANGE(num_pm, 1, 2, "num_pm not in [1, 2]") 2774 ipm_list__(1:num_pm) = ipm_list(:) 2775 end if 2776 2777 ik_ibz = gwr%kbz2ibz(1, ik_bz); isym_k = gwr%kbz2ibz(2, ik_bz) 2778 trev_k = gwr%kbz2ibz(6, ik_bz); g0_k = gwr%kbz2ibz(3:5, ik_bz) 2779 isirr_k = (isym_k == 1 .and. trev_k == 0 .and. all(g0_k == 0)) 2780 tsign_k = merge(1, -1, trev_k == 0) 2781 !ABI_CHECK(all(g0_k == 0), sjoin("For kbz:", ktoa(gwr%kbz(:, ik_bz)), "g0_k:", ltoa(g0_k), " != 0")) 2782 2783 ! Copy descriptor from IBZ 2784 associate (desc_kibz => gwr%green_desc_kibz(ik_ibz)) 2785 call desc_kibz%copy(desc_kbz) 2786 2787 if (isirr_k) then 2788 ! Copy the PBLAS matrices with the two Green's functions and we are done. 2789 do ii=1,num_pm 2790 ipm = ipm_list__(ii) 2791 call gwr%gt_kibz(ipm, ik_ibz, itau, spin)%copy(gt_pm(ipm)) 2792 end do 2793 goto 10 2794 end if 2795 2796 ! From: 2797 ! 2798 ! u_{Sk}(Sg) = e^{-i(Sk+g).tnon} u_k(g) 2799 ! 2800 ! and 2801 ! 2802 ! u_{k+g0}(g-g0) = u_k(g) 2803 ! 2804 ! one obtains: 2805 ! 2806 ! G_{Sk+g0}(Sg-g0,Sg'-g0) = e^{-i tnon.S(g-g')} G_k{g,g'} 2807 ! 2808 ! For time-reversal, we have u_{-k}(g) = u_{k}{-g}^* 2809 ! 2810 ! G_{-k}(-g,-g') = [G_k(g,g')]* 2811 2812 !ABI_WARNING_IF(trev_k == 0, "green: trev_k /= 0 should be tested") 2813 2814 ! Rotate gvec, recompute gbound and rotate vc_sqrt 2815 ! TODO: 1) Handle TR and routine to rotate tchi/W including vc_sqrt 2816 ! 2) Make sure that the FFT box is large enough to accommodate umklapps 2817 2818 desc_kbz%ig0 = -1 2819 do ig1=1,desc_kbz%npw 2820 desc_kbz%gvec(:,ig1) = tsign_k * matmul(gwr%cryst%symrec(:,:,isym_k), desc_kibz%gvec(:,ig1)) - g0_k 2821 if (all(desc_kbz%gvec(:,ig1) == 0)) desc_kbz%ig0 = ig1 2822 end do 2823 desc_kbz%kin_sorted = .False. 2824 ABI_CHECK(desc_kbz%ig0 /= -1, "Cannot find g=0 after rotation!") 2825 2826 call sphereboundary(desc_kbz%gbound, desc_kbz%istwfk, desc_kbz%gvec, gwr%g_mgfft, desc_kbz%npw) 2827 2828 ! Get G_k with k in the BZ. 2829 tnon = gwr%cryst%tnons(:, isym_k) 2830 do ii=1,num_pm 2831 ipm = ipm_list__(ii) 2832 associate (gk_i => gwr%gt_kibz(ipm, ik_ibz, itau, spin), gk_f => gt_pm(ipm)) 2833 call gk_i%copy(gk_f) 2834 !!$OMP PARALLEL DO PRIVATE(ig1, g2, ph2, ig1, g2, ph1) 2835 do il_g2=1, gk_f%sizeb_local(2) 2836 ig2 = mod(gk_f%loc2gcol(il_g2) - 1, desc_kbz%npw) + 1 2837 g2 = desc_kbz%gvec(:,ig2) 2838 !g2 = desc_kibz%gvec(:,ig2) 2839 ph2 = exp(+j_dpc * two_pi * dot_product(g2, tnon)) 2840 do il_g1=1, gk_f%sizeb_local(1) 2841 ig1 = mod(gk_f%loc2grow(il_g1) - 1, desc_kbz%npw) + 1 2842 g1 = desc_kbz%gvec(:,ig1) 2843 !g1 = desc_kibz%gvec(:,ig1) 2844 ph1 = exp(-j_dpc * two_pi * dot_product(g1, tnon)) 2845 gk_f%buffer_cplx(il_g1, il_g2) = gk_i%buffer_cplx(il_g1, il_g2) * ph1 * ph2 2846 if (trev_k == 1) gk_f%buffer_cplx(il_g1, il_g2) = conjg(gk_f%buffer_cplx(il_g1, il_g2)) 2847 end do 2848 end do 2849 end associate 2850 end do 2851 end associate 2852 2853 10 continue 2854 !call cwtime_report(" gwr_rotate_gpm:", cpu, wall, gflops) 2855 2856 end subroutine gwr_rotate_gpm
m_gwr/gwr_rotate_wc [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_rotate_wc
FUNCTION
Reconstruct Wc(q,g,g') in the BZ from the IBZ.
INPUTS
OUTPUT
SOURCE
3219 subroutine gwr_rotate_wc(gwr, iq_bz, itau, spin, desc_qbz, wc_qbz) 3220 3221 !Arguments ------------------------------------ 3222 class(gwr_t),intent(inout) :: gwr 3223 integer,intent(in) :: iq_bz, itau, spin 3224 type(desc_t),intent(out) :: desc_qbz 3225 type(__slkmat_t),intent(inout) :: wc_qbz 3226 3227 !Local variables------------------------------- 3228 !scalars 3229 integer :: ig1, ig2, il_g1, il_g2, iq_ibz, isym_q, trev_q, tsign_q 3230 logical :: isirr_q, q_is_gamma 3231 !arrays 3232 integer :: g1(3), g2(3), g0_q(3) 3233 real(dp) :: tnon(3), qq_bz(3) 3234 complex(dp) :: ph2, ph1 3235 3236 ! ************************************************************************* 3237 3238 ABI_CHECK(gwr%wc_space == "itau", sjoin("wc_space:", gwr%wc_space, " != itau")) 3239 3240 qq_bz = gwr%qbz(:, iq_bz) 3241 q_is_gamma = normv(qq_bz, gwr%cryst%gmet, "G") < GW_TOLQ0 3242 3243 iq_ibz = gwr%qbz2ibz(1, iq_bz); isym_q = gwr%qbz2ibz(2, iq_bz) 3244 trev_q = gwr%qbz2ibz(6, iq_bz); g0_q = gwr%qbz2ibz(3:5, iq_bz) 3245 isirr_q = (isym_q == 1 .and. trev_q == 0 .and. all(g0_q == 0)) 3246 tsign_q = merge(1, -1, trev_q == 0) 3247 ! TODO: Understand why legacy GW does not need umklapp 3248 !ABI_CHECK(all(g0_q == 0), sjoin("For qbz:", ktoa(gwr%qbz(:, iq_bz)), "g0_q:", ltoa(g0_q), " != 0")) 3249 3250 ! Copy descriptor from IBZ 3251 associate (desc_qibz => gwr%tchi_desc_qibz(iq_ibz)) 3252 call desc_qibz%copy(desc_qbz) 3253 3254 if (isirr_q) then 3255 ! Copy the PBLAS matrix in wc_qbz and we are done. 3256 call gwr%wc_qibz(iq_ibz, itau, spin)%copy(wc_qbz); return 3257 end if 3258 3259 !ABI_WARNING_IF(trev_q == 0, "trev_q should be tested") 3260 ! rotate gvec, recompute gbound and rotate vc_sqrt. 3261 ! TODO: 1) Handle TR and routine to rotate tchi/W including vc_sqrt 3262 ! 2) Make sure that FFT box is large enough to accomodate umklapps 3263 desc_qbz%ig0 = -1 3264 do ig1=1,desc_qbz%npw 3265 desc_qbz%gvec(:,ig1) = tsign_q * matmul(gwr%cryst%symrec(:,:,isym_q), desc_qibz%gvec(:,ig1)) - g0_q 3266 if (all(desc_qbz%gvec(:,ig1) == 0)) desc_qbz%ig0 = ig1 3267 end do 3268 desc_qbz%kin_sorted = .False. 3269 ABI_CHECK(desc_qbz%ig0 /= -1, "Cannot find g = 0 after g-vector rotation!") 3270 3271 call sphereboundary(desc_qbz%gbound, desc_qbz%istwfk, desc_qbz%gvec, gwr%g_mgfft, desc_qbz%npw) 3272 3273 ! Compute sqrt(vc(q,G)) 3274 ! TODO: rotate vc_sqrt 3275 ! vc(Sq, Sg) = vc(q, g) 3276 ! vc(-q, -g) = vc(q, g) 3277 call desc_qbz%get_vc_sqrt(qq_bz, q_is_gamma, gwr, gwr%gtau_comm%value) 3278 3279 ! Get Wc_q with q in the BZ. 3280 tnon = gwr%cryst%tnons(:, isym_q) 3281 associate (wq_i => gwr%wc_qibz(iq_ibz, itau, spin), wq_f => wc_qbz) 3282 call wq_i%copy(wc_qbz) 3283 3284 !!!$OMP PARALLEL DO PRIVATE(ig2, g2, phs2, ig1, g2, ph1) 3285 do il_g2=1, wq_f%sizeb_local(2) 3286 ig2 = mod(wq_f%loc2gcol(il_g2) - 1, desc_qbz%npw) + 1 3287 g2 = desc_qbz%gvec(:,ig2) 3288 !g2 = desc_qibz%gvec(:,ig2) 3289 !ph2 = exp(-j_dpc * two_pi * dot_product(g2, tnon)) 3290 ph2 = exp(+j_dpc * two_pi * dot_product(g2, tnon)) 3291 do il_g1=1, wq_f%sizeb_local(1) 3292 ig1 = mod(wq_f%loc2grow(il_g1) - 1, desc_qbz%npw) + 1 3293 g1 = desc_qbz%gvec(:,ig1) 3294 !g1 = desc_qibz%gvec(:,ig1) 3295 !ph1 = exp(+j_dpc * two_pi * dot_product(g1, tnon)) 3296 ph1 = exp(-j_dpc * two_pi * dot_product(g1, tnon)) 3297 wq_f%buffer_cplx(il_g1, il_g2) = wq_i%buffer_cplx(il_g1, il_g2) * ph1 * ph2 3298 if (trev_q == 1) wq_f%buffer_cplx(il_g1, il_g2) = conjg(wq_f%buffer_cplx(il_g1, il_g2)) 3299 end do 3300 end do 3301 end associate 3302 end associate 3303 3304 end subroutine gwr_rotate_wc
m_gwr/gwr_rpa_energy [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_rpa_energy
FUNCTION
Compute the correlated part of the total energy within ACFDT.
INPUTS
OUTPUT
SOURCE
6100 subroutine gwr_rpa_energy(gwr) 6101 6102 !Arguments ------------------------------------ 6103 class(gwr_t),target,intent(inout) :: gwr 6104 6105 !Local variables------------------------------- 6106 !scalars 6107 integer,parameter :: master = 0 6108 integer :: my_is, my_iqi, my_it, itau, spin, iq_ibz, ii, ierr, ig, ncut, icut, mat_size 6109 integer :: il_g1, il_g2, ig1, ig2, npw_q, ig0, ncid, ncerr 6110 logical :: q_is_gamma, print_time 6111 real(dp) :: weight, qq_ibz(3), estep, aa, bb, rmsq, ecut_soft, damp, tsec(2) 6112 real(dp) :: cpu_all, wall_all, gflops_all, cpu_q, wall_q, gflops_q, cpu_cut, wall_cut, gflops_cut 6113 complex(dpc) :: vcs_g1, vcs_g2 6114 type(desc_t),pointer :: desc_q 6115 character(len=500) :: msg 6116 !arrays 6117 type(__slkmat_t) :: chi_tmp, dummy_vec, chi_4diag 6118 type(processor_scalapack) :: proc_4diag 6119 real(gwp),allocatable :: eig(:) 6120 real(dp),allocatable :: kin_qg(:), ec_rpa(:), ec_mp2(:), ecut_chi(:) 6121 6122 ! ************************************************************************* 6123 6124 call gwr%build_chi0_head_and_wings() 6125 call gwr%build_green(free_ugb=.True.) 6126 call gwr%build_tchi() 6127 6128 ! Compute RPA energy for ncut cutoff energies in order to extrapolate for ecuteps --> oo 6129 ! See also calc_rpa_functional in m_screening_driver 6130 ncut = max(1, gwr%dtset%gwr_rpa_ncut) ! Usually 5 6131 estep = -gwr%dtset%ecuteps * 0.05_dp 6132 6133 call cwtime(cpu_all, wall_all, gflops_all, "start") 6134 call timab(1928, 1, tsec) 6135 call wrtout(std_out, sjoin(" Begin computation of RPA energy with gwr_rpa_ncut:", itoa(ncut), " ...")) 6136 ABI_CHECK(gwr%tchi_space == "iomega", sjoin("tchi_space:", gwr%tchi_space, "!= iomega")) 6137 6138 ABI_CALLOC(ec_rpa, (ncut)) 6139 ABI_CALLOC(ec_mp2, (ncut)) 6140 ABI_MALLOC(ecut_chi, (ncut)) 6141 ecut_chi = arth(gwr%dtset%ecuteps + tol12, estep, ncut) 6142 6143 ! Polarizability has been summed over spins inside build_tchi. 6144 ! The loop over spins is needed to parallelize the loop over my_iqi if nsppol == 2. 6145 do my_is=1,gwr%my_nspins 6146 spin = gwr%my_spins(my_is) 6147 if (gwr%spin_comm%nproc == 1 .and. spin == 2) cycle 6148 6149 do my_iqi=1,gwr%my_nqibz 6150 if (gwr%spin_comm%skip(my_iqi)) cycle 6151 print_time = gwr%comm%me == 0 .and. (my_iqi < LOG_MODK .or. mod(my_iqi, LOG_MODK) == 0) 6152 if (print_time) call cwtime(cpu_q, wall_q, gflops_q, "start") 6153 6154 iq_ibz = gwr%my_qibz_inds(my_iqi); qq_ibz = gwr%qibz(:, iq_ibz) 6155 q_is_gamma = normv(qq_ibz, gwr%cryst%gmet, "G") < GW_TOLQ0 6156 !if (q_is_gamma) then 6157 ! call wrtout([std_out, ab_out], "RPA: Ignoring q==0"); cycle 6158 !end if 6159 6160 ! iq_ibz might be replicated inside gwr%kpt_comm. 6161 if (.not. gwr%itreat_iqibz(iq_ibz)) cycle 6162 6163 desc_q => gwr%tchi_desc_qibz(iq_ibz) 6164 ABI_CHECK(desc_q%kin_sorted, "g-vectors are not sorted by |q+g|^2/2 !") 6165 npw_q = desc_q%npw; ig0 = desc_q%ig0 6166 6167 ABI_MALLOC(kin_qg, (npw_q)) 6168 do ig=1,npw_q 6169 kin_qg(ig) = half * normv(qq_ibz + desc_q%gvec(:,ig), gwr%cryst%gmet, "G") ** 2 6170 end do 6171 6172 do my_it=1,gwr%my_ntau 6173 itau = gwr%my_itaus(my_it) 6174 associate (tchi => gwr%tchi_qibz(iq_ibz, itau, spin)) 6175 if (my_it == 1) then 6176 ! Allocate workspace. NB: npw_q is the total number of PWs for this q. 6177 call tchi%copy(chi_tmp) 6178 !ABI_CHECK_IEQ(npw_q, tchi%sizeb_global(1), "npw_q") 6179 ABI_MALLOC(eig, (npw_q)) 6180 end if 6181 6182 do icut=1,ncut 6183 call cwtime(cpu_cut, wall_cut, gflops_cut, "start") 6184 6185 ! Damp Coulomb kernel in order to have smooth E(V). 6186 ! See also https://www.vasp.at/wiki/index.php/ENCUTGWSOFT 6187 ! and Harl's PhD thesis available at: https://utheses.univie.ac.at/detail/2259 6188 ecut_soft = 0.8_dp * ecut_chi(icut) 6189 6190 ! TODO: Contribution due to the head for q --> 0 is ignored. 6191 ! This is not optimal but consistent with calc_rpa_functional 6192 do il_g2=1,tchi%sizeb_local(2) 6193 !ig2 = mod(tchi%loc2gcol(il_g2) - 1, desc_q%npw) + 1 6194 ig2 = tchi%loc2gcol(il_g2) 6195 damp = one 6196 !if (kin_qg(ig2) > ecut_soft) then 6197 ! damp = sqrt(half * (one + cos(pi * (kin_qg(ig2) - ecut_soft) / (ecut_chi(icut) - ecut_soft)))) 6198 !end if 6199 vcs_g2 = desc_q%vc_sqrt(ig2) * damp 6200 if (q_is_gamma .and. ig2 == ig0) vcs_g2 = zero 6201 6202 do il_g1=1,tchi%sizeb_local(1) 6203 !ig1 = mod(tchi%loc2grow(il_g1) - 1, desc_q%npw) + 1 6204 ig1 = tchi%loc2grow(il_g1) 6205 damp = one 6206 !if (kin_qg(ig1) > ecut_soft) then 6207 ! damp = sqrt(half * (one + cos(pi * (kin_qg(ig1) - ecut_soft) / (ecut_chi(icut) - ecut_soft)))) 6208 !end if 6209 vcs_g1 = desc_q%vc_sqrt(ig1) * damp 6210 if (q_is_gamma .and. ig1 == ig0) vcs_g1 = zero 6211 6212 chi_tmp%buffer_cplx(il_g1, il_g2) = tchi%buffer_cplx(il_g1, il_g2) * vcs_g1 * vcs_g2 6213 end do 6214 end do 6215 6216 ! Diagonalize sub-matrix and perform integration in imaginary frequency. 6217 ! Eq (6) in 10.1103/PhysRevB.81.115126 6218 ! NB: have to build chi_tmp inside loop over icut as matrix is destroyed by pzheev. 6219 mat_size = bisect(kin_qg, ecut_chi(icut)) 6220 6221 ! Change size block and, if possible, use 2D rectangular grid of processors for diagonalization 6222 call proc_4diag%init(chi_tmp%processor%comm) 6223 call chi_tmp%change_size_blocs(chi_4diag, processor=proc_4diag) 6224 call chi_4diag%heev("N", "U", dummy_vec, eig, mat_size=mat_size) 6225 call chi_4diag%free() 6226 call proc_4diag%free() 6227 6228 ! TODO: ELPA 6229 !call compute_eigen_problem(processor, matrix, results, eigen, comm, istwf_k, nev) 6230 6231 if (xmpi_comm_rank(chi_tmp%processor%comm) == 0) then 6232 weight = gwr%wtq(iq_ibz) * gwr%iw_wgs(itau) / two_pi 6233 do ii=1,mat_size 6234 ec_rpa(icut) = ec_rpa(icut) + weight * (log(one - eig(ii)) + eig(ii)) 6235 ! second order Moeller Plesset. 6236 ec_mp2(icut) = ec_mp2(icut) - weight * eig(ii) ** 2 / two 6237 !if (eig(ii) > zero) then 6238 ! write(msg, "(a, es16.8)")"Positive eigenvalue:", eig(ii) 6239 ! ABI_ERROR(msg) 6240 !end if 6241 end do 6242 end if 6243 6244 write(msg,'(4x,2(a,i0),a)')"icut [", icut, "/", ncut, "]" 6245 call cwtime_report(msg, cpu_cut, wall_cut, gflops_cut) 6246 end do ! icut 6247 6248 if (my_it == gwr%my_ntau) then 6249 ! Free workspace 6250 call chi_tmp%free() 6251 ABI_FREE(eig) 6252 end if 6253 end associate 6254 end do ! my_it 6255 6256 ABI_FREE(kin_qg) 6257 if (print_time) then 6258 write(msg,'(4x,2(a,i0),a)')"My iqi [", my_iqi, "/", gwr%my_nqibz, "]" 6259 call cwtime_report(msg, cpu_q, wall_q, gflops_q) 6260 end if 6261 end do ! my_iqi 6262 end do ! my_is 6263 6264 ! Collect results on the master node. 6265 call xmpi_sum_master(ec_rpa, master, gwr%comm%value, ierr) 6266 call xmpi_sum_master(ec_mp2, master, gwr%comm%value, ierr) 6267 6268 if (gwr%comm%me == master) then 6269 ! Print results to ab_out. 6270 ! TODO: Add metadata: nband, nqbz... 6271 write(ab_out, "(4a16)")"ecut_chi", "ecut_chi^(-3/2)", "RPA Ec (eV)", "RPA Ec (Ha)" 6272 do icut=ncut,1,-1 6273 write(ab_out, "(*(es16.8))") ecut_chi(icut), ecut_chi(icut) ** (-three/two), ec_rpa(icut) * Ha_eV, ec_rpa(icut) 6274 end do 6275 if (ncut > 1) then 6276 ! Add last line with extrapolated value. 6277 rmsq = linfit(ncut, ecut_chi(:) ** (-three/two), ec_rpa, aa, bb) 6278 write(ab_out, "(2a16,*(es16.8))") "oo", "0", bb * Ha_eV, bb 6279 end if 6280 6281 ! ====================== 6282 ! Add results to GWR.nc 6283 ! ====================== 6284 NCF_CHECK(nctk_open_modify(ncid, gwr%gwrnc_path, xmpi_comm_self)) 6285 ncerr = nctk_def_dims(ncid, [nctkdim_t("ncut", ncut)], defmode=.True.) 6286 NCF_CHECK(ncerr) 6287 6288 ncerr = nctk_def_arrays(ncid, [ & 6289 nctkarr_t("ecut_chi", "dp", "ncut"), & 6290 nctkarr_t("ec_rpa_ecut", "dp", "ncut"), & 6291 nctkarr_t("ec_mp2_ecut", "dp", "ncut") & 6292 ]) 6293 NCF_CHECK(ncerr) 6294 6295 ! Write data. 6296 NCF_CHECK(nctk_set_datamode(ncid)) 6297 NCF_CHECK(nf90_put_var(ncid, vid("ecut_chi"), ecut_chi)) 6298 NCF_CHECK(nf90_put_var(ncid, vid("ec_rpa_ecut"), ec_rpa)) 6299 NCF_CHECK(nf90_put_var(ncid, vid("ec_mp2_ecut"), ec_mp2)) 6300 end if ! master 6301 6302 ABI_FREE(ec_rpa) 6303 ABI_FREE(ec_mp2) 6304 ABI_FREE(ecut_chi) 6305 6306 call cwtime_report(" gwr_rpa_energy:", cpu_all, wall_all, gflops_all) 6307 call timab(1928, 2, tsec) 6308 6309 contains 6310 integer function vid(vname) 6311 character(len=*),intent(in) :: vname 6312 vid = nctk_idname(ncid, vname) 6313 end function vid 6314 6315 end subroutine gwr_rpa_energy
m_gwr/gwr_rpr_to_ggp [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_rpr_to_ggp
FUNCTION
Helper function to FFT a two-point function: F_{r',r} --> F_{g,g'} Note that rp_r is destroyed in output
INPUTS
OUTPUT
SOURCE
3155 subroutine gwr_rpr_to_ggp(gwr, desc, rp_r, g_gp) 3156 3157 !Arguments ------------------------------------ 3158 class(gwr_t),intent(in) :: gwr 3159 type(desc_t),intent(in) :: desc 3160 class(__slkmat_t),intent(inout) :: rp_r, g_gp 3161 3162 !Local variables------------------------------- 3163 integer :: ig2, npwsp, nrsp, col_bsize, ir2, ndat, isign 3164 type(__slkmat_t) :: r_gp, gp_r 3165 character(len=500) :: msg 3166 type(uplan_t) :: uplan_k 3167 3168 ! ************************************************************************* 3169 3170 ! Allocate intermediate gp_r PBLAS matrix to store F(g',r) 3171 npwsp = desc%npw * gwr%nspinor; nrsp = gwr%g_nfft * gwr%nspinor 3172 ABI_CHECK(block_dist_1d(nrsp, gwr%g_comm%nproc, col_bsize, msg), msg) 3173 3174 call gp_r%init(npwsp, nrsp, gwr%g_slkproc, desc%istwfk, size_blocs=[-1, col_bsize]) 3175 3176 call uplan_k%init(desc%npw, gwr%nspinor, gwr%uc_batch_size, gwr%g_ngfft, desc%istwfk, & 3177 desc%gvec, gwpc, gwr%dtset%gpu_option) 3178 3179 isign = +1 ! This should be ok 3180 !isign = -1 3181 3182 ! F(r',r) --> F(g',r) and store results in gp_r. 3183 do ir2=1, rp_r%sizeb_local(2), gwr%uc_batch_size 3184 ndat = blocked_loop(ir2, rp_r%sizeb_local(2), gwr%uc_batch_size) 3185 call uplan_k%execute_rg(ndat, rp_r%buffer_cplx(:,ir2), gp_r%buffer_cplx(:,ir2), isign=isign, iscale=0) ! this should be OK 3186 end do 3187 3188 ! F(g',r) --> F(r,g') 3189 !call gp_r%ptrans("N", r_gp, free=.True.) 3190 ! FIXME: I don't know why by C is needed here. 3191 call gp_r%ptrans("C", r_gp, free=.True.) 3192 3193 ! F(r,g') --> F(g,g') and store results in g_gp. 3194 do ig2=1, g_gp%sizeb_local(2), gwr%uc_batch_size 3195 ndat = blocked_loop(ig2, g_gp%sizeb_local(2), gwr%uc_batch_size) 3196 call uplan_k%execute_rg(ndat, r_gp%buffer_cplx(:,ig2), g_gp%buffer_cplx(:,ig2), isign=-isign, iscale=0) ! this should be OK 3197 end do 3198 3199 call uplan_k%free(); call r_gp%free() 3200 3201 end subroutine gwr_rpr_to_ggp
m_gwr/gwr_run_chi0 [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_run_chi0
FUNCTION
Driver to compute CHI0 along the imaginary axis.
INPUTS
[free_ugb]: True if array with empty KS states should freed as soon as possibile. Default: True
OUTPUT
SOURCE
6373 subroutine gwr_run_chi0(gwr, free_ugb) 6374 6375 !Arguments ------------------------------------ 6376 class(gwr_t),intent(inout) :: gwr 6377 logical,optional,intent(in) :: free_ugb 6378 6379 !Local variables------------------------------- 6380 logical :: free_ugb__ 6381 6382 ! ************************************************************************* 6383 6384 ! Use ugb wavefunctions and the Lehmann representation to compute head/wings and Sigma_x matrix elements. 6385 call gwr%build_chi0_head_and_wings() 6386 6387 ! Now compute G(itau) from ugb and start the GWR algorithm. 6388 free_ugb__ = .True.; if (present(free_ugb)) free_ugb__ = free_ugb 6389 call gwr%build_green(free_ugb=free_ugb__) 6390 call gwr%build_tchi() 6391 6392 end subroutine gwr_run_chi0
m_gwr/gwr_run_energy_scf [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_run_energy_scf
FUNCTION
Compute QP energies within energy-only self-consistent GW approximation and minimax meshes along the imaginary axis.
INPUTS
OUTPUT
SOURCE
6411 subroutine gwr_run_energy_scf(gwr) 6412 6413 !Arguments ------------------------------------ 6414 class(gwr_t),intent(inout) :: gwr 6415 6416 !Local variables------------------------------- 6417 integer,parameter :: master = 0 6418 integer :: units(2) 6419 logical :: converged 6420 character(len=500) :: msg 6421 6422 ! ************************************************************************* 6423 6424 ! TODO: 6425 ! To implement restart capabilities we need to read scf_iteration, qp_ebands and gwr_task from GWR.nc 6426 ! build_sigmac should be responsible for writing checkpoint data with qp_ebands at each iteration. 6427 units = [std_out, ab_out] 6428 6429 ABI_CHECK_IEQ(gwr%nkcalc, gwr%nkibz, "For energy-only GW, one should include all k-points in the IBZ") 6430 6431 select case (gwr%dtset%gwr_task) 6432 case ("EGEW") 6433 converged = .False. 6434 call wrtout(units, " Begin energy-only self-consistency in both G and W (EGEW)") 6435 do while (.not. converged .and. gwr%scf_iteration <= gwr%dtset%gwr_nstep) 6436 call gwr%run_g0w0(free_ugb=.False.) 6437 gwr%scf_iteration = gwr%scf_iteration + 1 6438 call gwr%check_scf_cycle(converged) 6439 end do 6440 6441 case ("EGW0") 6442 call wrtout(units, " Begin energy-only self-consistency in G (EGW0)") 6443 call gwr%run_g0w0(free_ugb=.False.) 6444 converged = .False. 6445 do while (.not. converged .and. gwr%scf_iteration <= gwr%dtset%gwr_nstep) 6446 gwr%scf_iteration = gwr%scf_iteration + 1 6447 call gwr%build_green(free_ugb=.False.) 6448 call gwr%build_sigxme() ! NB: This should not change in semiconductors 6449 call gwr%build_sigmac() 6450 call gwr%check_scf_cycle(converged) 6451 end do 6452 6453 case ("G0EW") 6454 ! This is more difficult to implement as we need to store G0 and eG 6455 ! and then use G only for chi and not in Sigma 6456 call wrtout(units, " Begin energy-only self-consistency in W (G0EW)") 6457 ABI_ERROR("G0WE is not yet implemented") 6458 call gwr%run_g0w0(free_ugb=.False.) 6459 converged = .False. 6460 do while (.not. converged .and. gwr%scf_iteration <= gwr%dtset%gwr_nstep) 6461 gwr%scf_iteration = gwr%scf_iteration + 1 6462 !call gwr%build_green(free_ugb=.False.) 6463 call gwr%build_chi0_head_and_wings() 6464 call gwr%build_tchi() 6465 call gwr%build_wc() 6466 call gwr%build_sigmac() 6467 call gwr%check_scf_cycle(converged) 6468 end do 6469 6470 case default 6471 ABI_ERROR(sjoin("Invalid gwr_task:", gwr%dtset%gwr_task)) 6472 end select 6473 6474 if (gwr%comm%me == master) then 6475 if (converged) then 6476 write(msg, "(1x,4a,i0,a,f8.3,a)") & 6477 trim(gwr%dtset%gwr_task), " self-consistent loop:", ch10, & 6478 " Convergence achieved at iteration: ", gwr%scf_iteration, & 6479 " with gwr_tolqpe: ",gwr%dtset%gwr_tolqpe * Ha_meV, " (meV)" 6480 call wrtout(units, msg) 6481 else 6482 write(msg, "(1x,4a,f8.3,3a,i0,a)") & 6483 trim(gwr%dtset%gwr_task), " self-consistent loop:", ch10, & 6484 " WARNING: Could not converge with gwr_tolqpe: ",gwr%dtset%gwr_tolqpe * Ha_meV, " (meV)", ch10, & 6485 " after: ", gwr%dtset%gwr_nstep, " steps" 6486 call wrtout(units, msg) 6487 end if 6488 end if 6489 6490 end subroutine gwr_run_energy_scf
m_gwr/gwr_run_g0w0 [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_run_g0w0
FUNCTION
Driver to compute QP energies within the G0W0 approximation and minimax meshes along the imaginary axis.
INPUTS
[free_ugb]: True if array with empty KS states should freed as soon as possibile. Default: True
OUTPUT
SOURCE
6334 subroutine gwr_run_g0w0(gwr, free_ugb) 6335 6336 !Arguments ------------------------------------ 6337 class(gwr_t),intent(inout) :: gwr 6338 logical,optional,intent(in) :: free_ugb 6339 6340 !Local variables------------------------------- 6341 logical :: free_ugb__ 6342 6343 ! ************************************************************************* 6344 6345 ! Use ugb wavefunctions and the Lehmann representation to compute head/wings and Sigma_x matrix elements. 6346 call gwr%build_chi0_head_and_wings() 6347 call gwr%build_sigxme() 6348 6349 ! Now compute G(itau) from ugb and start the GWR algorithm. 6350 free_ugb__ = .True.; if (present(free_ugb)) free_ugb__ = free_ugb 6351 call gwr%build_green(free_ugb=free_ugb__) 6352 call gwr%build_tchi() 6353 call gwr%build_wc() 6354 call gwr%build_sigmac() 6355 6356 end subroutine gwr_run_g0w0
m_gwr/gwr_t [ Types ]
NAME
gwr_t
FUNCTION
This object provides the high-level API used to perform the different steps of the GWR algorithm.
SOURCE
301 type, public :: gwr_t 302 303 integer :: nsppol = 1, nspinor = -1, nsig_ab = -1, nspden = -1 304 ! Number of independent spin polarizations, number of spinor components and spin densities. 305 306 integer :: natom = -1 307 ! Number of atoms 308 309 integer :: usepaw = -1 310 ! 0 if NC pseudos. 1 if PAW is used (not yet supported). 311 312 integer :: my_nspins = -1 313 ! Number of independent spin polarizations treated by this MPI proc 314 315 integer :: nkbz = -1, nkibz = -1 316 ! Number of k-points in the BZ/IBZ 317 318 integer :: my_nkibz = -1, my_nkbz = -1 319 ! Number of k-points in the IBZ/BZ stored by this MPI proc. 320 321 integer :: uc_batch_size = -1 322 ! Max number of unit cell FFT-transforms done in batch mode. 323 324 integer :: sc_batch_size = -1 325 ! Max number of supercell-cell FFT-transforms done in batch mode. 326 327 integer,allocatable :: my_kbz_inds(:) 328 ! (my_nkbz) 329 ! List of k-BZ indices treated by this proc. 330 331 integer,allocatable :: my_kibz_inds(:) 332 ! (my_nkibz) 333 ! List of k-IBZ indices treated by this proc. 334 335 integer :: nqbz = -1, nqibz = -1 336 ! Number of q-points in the BZ/IBZ 337 338 integer :: my_nqibz = -1, my_nqbz = -1 339 ! Number of q-points in the IBZ/BZ stored by this MPI proc. 340 341 integer,allocatable :: my_qibz_inds(:) 342 ! (my_nqibz) 343 ! List of q-IBZ indices treated by this proc. 344 345 integer,allocatable :: my_qbz_inds(:) 346 ! (my_nqbz) 347 ! List of q-IBZ indices treated by this proc. 348 349 integer :: ntau = -1 350 ! Total number of imaginary time points. 351 352 integer :: my_ntau = -1 353 ! Number of imaginary time/frequency points treated by this MPI rank. 354 355 integer :: nkcalc 356 ! Number of Sigma_nk k-points computed 357 ! TODO: Should be spin dependent + max_nkcalc 358 359 integer :: max_nbcalc 360 ! Maximum number of bands computed (max over nkcalc and spin). 361 362 integer :: nwr = -1 363 ! Number of frequency points along the real axis for Sigma(w) and spectral function A(w) 364 ! Odd number so that the mesh is centered on the KS energy. 365 366 !real(dp) :: i_sz = huge(one) 367 ! Value of the integration of the Coulomb singularity 4\pi/V_BZ \int_BZ d^3q 1/q^2 368 369 real(dp) :: wr_step = -one 370 ! Step of the linear mesh along the real axis (Ha units). 371 372 real(dp) :: q0(3) = GW_Q0_DEFAULT 373 ! The small q for the treatment of q --> 0 374 375 real(dp),allocatable :: kcalc(:,:) 376 ! kcalc(3, nkcalc) 377 ! List of k-points where the self-energy is computed. 378 379 logical :: idle_proc = .False. 380 ! True if there are idle procs i.e. if processes in the input_comm have been excluded. 381 382 !logical :: use_shmem_for_k = .False. 383 !logical :: use_mpi_for_k = .False. 384 385 integer,allocatable :: bstart_ks(:,:) 386 ! bstart_ks(nkcalc, nsppol) 387 ! Initial KS band index included in self-energy matrix elements for each k-point in kcalc. 388 ! Depends on spin because all degenerate states should be included when symmetries are used. 389 390 integer,allocatable :: bstop_ks(:,:) 391 ! bstop_ks(nkcalc, nsppol) 392 393 integer,allocatable :: nbcalc_ks(:,:) 394 ! nbcalc_ks(nkcalc, nsppol) 395 ! Number of bands included in self-energy matrix elements for each k-point in kcalc. 396 ! Depends on spin because all degenerate states should be included when symmetries are used. 397 398 integer,allocatable :: kcalc2ibz(:,:) 399 !kcalc2ibz(nkcalc, 6)) 400 ! Mapping ikcalc --> IBZ as reported by listkk. 401 402 logical :: use_supercell_for_tchi = .True. 403 ! True if we are using the supercell formalism for tchi 404 ! False if we are using the mixed-space approach with convolutions in k-space. 405 406 logical :: use_supercell_for_sigma = .True. 407 ! True if we are using the supercell formalism for sigma 408 ! False if we are using the mixed-space approach with convolutions in k-space. 409 410 integer :: ngkpt(3) = -1, ngqpt(3) = -1 411 ! Number of divisions in k/q meshes. 412 413 integer,allocatable :: my_spins(:) 414 ! (my_nspins) 415 ! Indirect table giving the spin indices treated by this MPI rank. 416 ! Used only in the collinear case with nsppol = 2. 417 418 integer,allocatable :: my_itaus(:) 419 ! (my_ntau) 420 ! Indirect table giving the tau indices treated by this MPI rank. 421 422 integer,allocatable :: tau_master(:) 423 ! (ntau) 424 ! The rank of the MPI proc in tau_comm treating itau. 425 426 integer, allocatable :: np_qibz(:) 427 ! (nqibz) 428 ! Number of processors in kpt_comm treating iq_ibz 429 430 integer, allocatable :: np_kibz(:) 431 ! (nkibz) 432 ! Number of processors in kpt_comm treating ik_ibz 433 434 logical, allocatable :: itreat_ikibz(:) 435 ! (nkibz) 436 ! True if this MPI rank treat ik_ibz 437 438 logical, allocatable :: itreat_iqibz(:) 439 ! (nqibz) 440 ! True if this MPI rank treats iq_ibz 441 442 real(dp),allocatable :: tau_mesh(:), tau_wgs(:) 443 ! (ntau) 444 ! Imaginary tau mesh and integration weights. 445 446 real(dp),allocatable :: iw_mesh(:), iw_wgs(:) 447 ! (ntau) 448 ! Imaginary frequency mesh and integration weights 449 450 real(dp),allocatable :: cosft_wt(:,:) 451 ! (ntau, ntau) 452 ! weights for cosine transform. (i tau --> i omega) 453 454 real(dp),allocatable :: cosft_tw(:,:) 455 ! (ntau, ntau) 456 ! weights for sine transform (i iomega --> i tau) 457 458 real(dp),allocatable :: sinft_wt(:,:) 459 ! (ntau, ntau) 460 ! weights for sine transform (i tau --> i omega) 461 462 real(dp) :: te_min = -one, te_max = one 463 ! min and Max transition energy in Ha. 464 465 real(dp) :: ft_max_error(3) = -one 466 ! Max error due to inhomogenous FT. 467 468 real(dp) :: cosft_duality_error = -one 469 ! Max_{ij} |CT CT^{-1} - I| 470 471 integer :: green_mpw = -1 472 ! Max number of g-vectors for Green's function over k-points. 473 474 integer :: tchi_mpw = -1 475 ! Max number of g-vectors for tchi over q-points. 476 477 !integer :: sigma_mpw = -1 478 ! Max number of g-vectors for Sigma over q-points. 479 480 integer :: g_ngfft(18) = -1, g_mgfft = -1, g_nfft = -1 481 ! FFT mesh for the Green's function. 482 483 !integer :: chi_ngfft(18) = -1, chi_mgfft = -1, chi_nfft = -1 484 !integer :: sig_ngfft(18) = -1, sig_mgfft = -1, sig_nfft = -1 485 486 integer :: mg0(3) = [2, 2, 2] 487 ! Max shifts to account for umklapps. 488 489 type(desc_t),allocatable :: green_desc_kibz(:) 490 ! (nkibz) 491 ! Descriptor for Green's functions 492 493 type(desc_t),allocatable :: tchi_desc_qibz(:) 494 ! (nqibz) 495 ! Descriptor for tchi. NB: The g-vectors are sorted by |q+g|^2/2 496 497 integer,allocatable :: chinpw_qibz(:) 498 ! Number of PWs in tchi for each q-point in the IBZ (available on all procs) 499 500 !type(desc_t),allocatable :: sigma_desc_kibz(:) 501 ! (nkibz) 502 ! Descriptor for self-energy 503 504 integer :: coords_stgk(4) = 0 505 ! Cartesian coordinates of this processor in the Cartesian grid. 506 507 type(xcomm_t) :: comm 508 ! Communicator with all MPI procs involved in the computation 509 ! NB: gwr%comm%value is not necessarly the same as the input_comm 510 ! we may decide to remove some procs from input_comm before createring the Cartesian grid. 511 512 type(xcomm_t) :: spin_comm 513 ! MPI communicator over spins. 514 515 type(xcomm_t) :: kpt_comm 516 ! MPI communicator for k/q-point distribution. 517 518 type(xcomm_t) :: g_comm 519 ! MPI communicator for g/r distribution 520 521 type(xcomm_t) :: tau_comm 522 ! MPI communicator for imag time distribution 523 524 type(xcomm_t) :: gtau_comm 525 ! MPI communicator for g/tau 2D subgrid. 526 527 type(xcomm_t) :: kg_comm 528 ! MPI communicator for g/g 2D subgrid. 529 530 type(xcomm_t) :: kts_comm 531 ! MPI communicator for tau/kpoint/spin 3D grid 532 533 type(xcomm_t) :: kgt_comm 534 ! MPI communicator for g/tau/kpoint 3D grid 535 536 type(dataset_type), pointer :: dtset => null() 537 ! Input variables. 538 539 type(datafiles_type), pointer :: dtfil => null() 540 ! Names of input/output files and prefixes. 541 542 type(crystal_t), pointer :: cryst => null() 543 ! Crystal structure. 544 545 integer :: scf_iteration = 1 546 ! Internal counter used to implement self-consistency 547 ! For the time being, only self-consistency in energies is supported. 548 549 integer,allocatable :: ks_vbik(:,:) 550 ! (gwr%ks_ebands%nkpt, gwr%ks_ebands%nsppol) 551 ! KS valence band indices. 552 553 type(ebands_t), pointer :: ks_ebands => null() 554 ! initial KS energies 555 556 type(gaps_t) :: ks_gaps 557 ! Info on the KS gaps. 558 559 type(ebands_t) :: qp_ebands 560 ! QP energies 561 562 type(ebands_t) :: qp_ebands_prev 563 ! QP energies of the previous iteration. Used if self-consistency. 564 565 type(pseudopotential_type), pointer :: psps => null() 566 ! NC Pseudos data 567 568 type(pawtab_type), pointer :: pawtab(:) => null() 569 ! PAW data 570 571 type(mpi_type),pointer :: mpi_enreg => null() 572 ! Sequential mpi_type needed to invoke ABINIT routines requiring it. 573 574 type(processor_scalapack) :: g_slkproc 575 ! 1D PBLAS grid to block-distribute matrices along columns inside gcomm. 576 577 type(__slkmat_t),allocatable :: gt_kibz(:,:,:,:) 578 ! (2, nkibz, ntau, nsppol) 579 ! Occupied/Empty Green's function G_k(g,g') 580 581 type(__slkmat_t),allocatable :: tchi_qibz(:,:,:) 582 ! (nqibz, ntau, nsppol) 583 ! Irreducible polarizability tchi_q(g,g') 584 585 character(len=10) :: tchi_space = "none" 586 ! "none", "itau", "iomega" 587 588 type(__slkmat_t),allocatable :: wc_qibz(:,:,:) 589 ! (nqibz, ntau, nsppol) 590 ! Correlated screened Coulomb interaction summed over collinear spins 591 ! Replicated across spin_comm if nsppol == 2. 592 593 character(len=10) :: wc_space = "none" 594 ! "none", "itau", "iomega" 595 596 !type(__slkmat_t),allocatable :: em1_qibz(:,:,:) 597 ! Inverse dielectric matrix at omega = 0 598 ! (nqibz, nsppol) 599 ! Replicated across the tau comm and the spin comm if nsppol == 2. 600 601 type(__slkmat_t),allocatable :: sigc_kibz(:,:,:,:) 602 ! (2, nkibz, ntau, nsppol) 603 604 character(len=10) :: sigc_space = "none" 605 ! "none", "itau", "iomega" 606 607 type(__slkmat_t),allocatable :: ugb(:,:) !, nato_ugb(:,:) 608 ! (nkibz, nsppol) 609 ! Fourier components of the KS wavefunctions stored in a PBLAS matrix 610 ! Bands are distributed in the g_comm communicator in a round-robin fashion. 611 ! hence they are REPLICATED over tau_comm as this leads to better scalability in terms of flops. 612 ! Distributing bands inside the 2D gtau_comm, indeed, allows one to reduce memory further 613 ! but then the pzgemm used to build G explodes. Also. tau parallelism is high-level in GWR so it's not a good idea 614 ! to mix it with low-level just to make memory for ugb scale better. 615 ! The size of ugb is negligible when compared to G and Chi. 616 617 type(processor_scalapack) :: gtau_slkproc 618 ! Scalapack grid with (g,tau) processors 619 620 integer :: ugb_nband = -1 621 ! Number of bands in ugb. 622 623 type(vcgen_t) :: vcgen 624 ! Object used to compute Coulomb term vc(q,g) 625 626 character(len=fnlen) :: gwrnc_path = ABI_NOFILE 627 ! Path to the GWR.nc file with output results. 628 629 real(dp),allocatable :: kbz(:,:) 630 ! (3, nkbz) 631 ! Reduced coordinates of the k-points in the full BZ. 632 633 real(dp), contiguous, pointer :: kibz(:,:) => null() 634 ! (3, nkibz) 635 ! Reduced coordinates of the k-points in the IBZ 636 637 integer,allocatable :: kbz2ibz(:,:) 638 ! (6, nkbz) 639 ! Mapping kBZ to IBZ (symrec conventions) 640 641 integer,allocatable :: kbz2ibz_symrel(:,:) 642 ! (6, nkbz) 643 ! Mapping kBZ to IBZ (symrel conventions) TODO: To be removed 644 645 real(dp), contiguous, pointer :: wtk(:) => null() 646 ! (nkibz) 647 ! Weights of the k-points in the IBZ (normalized to one). 648 649 real(dp),allocatable :: qbz(:,:) 650 ! (3, nqbz) 651 ! Reduced coordinates of the q-points in the full BZ. 652 653 integer,allocatable :: qbz2ibz(:,:) 654 ! (6, nqbz) 655 ! Mapping qBZ to IBZ (symrec conventions) 656 657 real(dp),allocatable :: qibz(:,:) 658 ! (3, nqibz) 659 ! Reduced coordinates of the q-points in the IBZ (full symmetry of the system). 660 661 real(dp),allocatable :: wtq(:) 662 ! (nqibz) 663 ! Weights of the q-points in the IBZ (normalized to one). 664 665 complex(dp),allocatable :: chi0_head_myw(:,:,:) 666 ! (3,3,my_ntau) 667 ! Head of the irred. polarizability in i.omega space. 668 ! Note that spins have been summed over. 669 670 complex(dp),allocatable :: chi0_uwing_myw(:,:,:), chi0_lwing_myw(:,:,:) 671 ! (3, npw_chi_gamma, my_ntau) 672 ! Upper wings of the irred. polarizability in i omega space. 673 ! Note that spins have been summed over. 674 675 type(wfdgw_t) :: kcalc_wfd 676 ! wavefunction descriptor with the KS states where QP corrections are wanted. 677 678 type(hdr_type) :: wfk_hdr 679 ! header of the WFK file 680 681 type(melements_t) :: ks_me !, qp_me 682 ! Matrix elements of the different potentials in the KS basis set. 683 684 type(degtab_t),allocatable :: degtab(:,:) 685 ! (nkcalc, nsppol) 686 ! Table used to average QP results in the degenerate subspace if symsigma == 1 687 688 integer :: b1gw = -1, b2gw = -1 689 ! b1gw = minval(gwr%bstart_ks); b2gw = maxval(gwr%bstop_ks) 690 691 logical :: sig_diago 692 ! True if Sigma_ matrices are diagonal in the band indices 693 694 complex(dp),allocatable :: sigx_mat(:,:,:,:) 695 ! (b1gw:b2gw, ?, nkcalc, nsppol*nsig_ab) 696 ! Matrix elements of <i|\Sigma_x|j>. The second dimension depends on sig_diago 697 698 !complex(dp),allocatable :: sigc_it_mat(:,:,:,:,:) 699 ! (2, ntau, max_nbcalc, nkcalc, nsppol*nsig_ab)) 700 ! Matrix elements of <i|\Sigma_c(itau)|j>. The second dimension depends on sig_diago 701 702 complex(dp),allocatable :: sigc_iw_mat(:,:,:,:,:) 703 ! Matrix elements of <i|\Sigma_c(i omega)|j> 704 ! (ntau, b1gw:b2gw, ?, nkcalc, nsppol*nsig_ab). The second dimension depends on sig_diago 705 706 type(pstat_t) :: pstat 707 ! Interface to the /proc/{pid}/status file. 708 709 contains 710 711 procedure :: init => gwr_init 712 ! Initialize the object. 713 714 procedure :: rotate_gpm => gwr_rotate_gpm 715 ! Reconstruct the Green's functions in the BZ from the IBZ. 716 717 procedure :: gk_to_scbox => gwr_gk_to_scbox 718 719 procedure :: wcq_to_scbox => gwr_wcq_to_scbox 720 721 procedure :: get_myk_green_gpr => gwr_get_myk_green_gpr 722 ! G_k(g,g') --> G_k(g',r) for each k in the BZ treated by this MPI proc for given spin and tau. 723 724 procedure :: get_gkbz_rpr_pm => gwr_get_gkbz_rpr_pm 725 ! Compute G_k(r',r) with (r, r') in the unit cell and k in the full BZ. 726 727 procedure :: rotate_wc => gwr_rotate_wc 728 ! Reconstruct Wc(q) in the BZ from the IBZ. 729 730 procedure :: get_myq_wc_gpr => gwr_get_myq_wc_gpr 731 ! W_q(g,g') --> W_q(g',r) for each q in the BZ treated by this MPI procs for given spin and tau. 732 733 procedure :: get_wc_rpr_qbz => gwr_get_wc_rpr_qbz 734 ! Compute Wc_q(r',r') with q in the BZ 735 736 procedure :: cos_transform => gwr_cos_transform 737 ! Inhomogeneous cosine transform. 738 739 procedure :: malloc_free_mats => gwr_malloc_free_mats 740 ! Allocate/Deallocate matrices for G/tchi/Sigma 741 742 procedure :: free => gwr_free 743 ! Free memory. 744 745 procedure :: print => gwr_print 746 ! Print info on the object. 747 748 procedure :: print_mem => gwr_print_mem 749 ! Print memory required by PBLAS matrices. 750 751 procedure :: print_trace => gwr_print_trace 752 ! Print trace of matrices for testing purposes. 753 754 procedure :: load_kcalc_wfd => gwr_load_kcalc_wfd 755 ! Load the KS states for Sigma_nk from the WFK file 756 757 procedure :: read_ugb_from_wfk => gwr_read_ugb_from_wfk 758 ! Read wavefunctions from WFK file. 759 760 procedure :: build_green => gwr_build_green 761 ! Build Green's functions in imaginary time from the gwr%ugb matrices stored in memory. 762 763 procedure :: build_tchi => gwr_build_tchi 764 ! Build the irreducible polarizability 765 766 procedure :: redistrib_gt_kibz => gwr_redistrib_gt_kibz 767 ! Redistribute/deallocate G_k 768 769 procedure :: redistrib_mats_qibz => gwr_redistrib_mats_qibz 770 ! Redistribute/deallocate tchi_q or Wc_q 771 772 procedure :: build_wc => gwr_build_wc 773 ! Build the correlated part of the screened interaction. 774 775 procedure :: build_sigmac => gwr_build_sigmac 776 ! Build the correlated part of the self-energy GWc 777 ! and compute matrix elements in the KS representation. 778 779 procedure :: rpa_energy => gwr_rpa_energy 780 ! Compute RPA energy. 781 782 procedure :: gamma_gw => gwr_gamma_gw 783 784 procedure :: build_chi0_head_and_wings => gwr_build_chi0_head_and_wings 785 ! Compute head and wings of chi0 786 787 procedure :: build_sigxme => gwr_build_sigxme 788 ! Compute matrix elements of the exchange part. 789 790 procedure :: get_u_ngfft => gwr_get_u_ngfft 791 ! Compute FFT mesh from boxcutmin 792 793 procedure :: run_g0w0 => gwr_run_g0w0 794 ! Compute QP corrections with one-shot G0W0. 795 796 procedure :: run_chi0 => gwr_run_chi0 797 ! Compute CHI0 only. 798 799 procedure :: run_energy_scf => gwr_run_energy_scf 800 ! Compute QP corrections with energy-only self-consistent GW 801 802 procedure :: check_scf_cycle => gwr_check_scf_cycle 803 ! Check SCF cycle for convergence. 804 805 procedure :: ncwrite_tchi_wc => gwr_ncwrite_tchi_wc 806 ! Write tchi or wc to netcdf file 807 808 end type gwr_t
m_gwr/gwr_wcq_to_scbox [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
gwr_wcq_to_scbox
FUNCTION
INPUTS
OUTPUT
SOURCE
2629 subroutine gwr_wcq_to_scbox(gwr, sc_ngfft, desc_myqbz, wc_scgvec, my_ir, ndat, & 2630 wc_gpr, wct_scbox, wct_scbox_win) 2631 2632 !Arguments ------------------------------------ 2633 class(gwr_t),target,intent(in) :: gwr 2634 integer,intent(in) :: sc_ngfft(18) 2635 integer,intent(out) :: wc_scgvec(3, gwr%tchi_mpw) 2636 type(desc_t),intent(inout) :: desc_myqbz(gwr%my_nqbz) 2637 type(__slkmat_t),intent(in) :: wc_gpr(gwr%my_nqbz) 2638 integer,intent(in) :: my_ir, ndat 2639 complex(gwpc),intent(out) :: wct_scbox(product(sc_ngfft(4:6))*gwr%nspinor, gwr%sc_batch_size) 2640 !complex(gwpc),intent(out) :: wct_scbox(:,:) 2641 integer,optional,intent(inout) :: wct_scbox_win 2642 2643 !Local variables------------------------------- 2644 integer :: my_iqf, iq_bz, idat, iepoch, ii, idat_list(gwr%kpt_comm%nproc) ! gg(3), ig, 2645 !real(dp) :: tsec(2) !, cpu, wall, gflops 2646 2647 ! ************************************************************************* 2648 2649 !call timab(1930, 1, tsec) 2650 2651 ! Take the union of (q,g') for q in the BZ. Note gwr%ngqpt instead of gwr%ngkpt. 2652 2653 if (.not. present(wct_scbox_win)) then 2654 wct_scbox = czero_gw 2655 do my_iqf=1,gwr%my_nqbz 2656 iq_bz = gwr%my_qbz_inds(my_iqf) 2657 2658 #if 1 2659 call desc_myqbz(my_iqf)%to_scbox(gwr%qbz(:,iq_bz), gwr%ngqpt, sc_ngfft, gwr%nspinor*ndat, & 2660 wc_gpr(my_iqf)%buffer_cplx(1,my_ir), wct_scbox) 2661 #else 2662 gg = nint(gwr%qbz(:, iq_bz) * gwr%ngqpt) 2663 associate (desc_q => desc_myqbz(my_iqf)) 2664 do ig=1,desc_q%npw 2665 wc_scgvec(:,ig) = gg + gwr%ngqpt * desc_q%gvec(:,ig) ! q + g' 2666 end do 2667 call gsph2box(sc_ngfft, desc_q%npw, gwr%nspinor * ndat, wc_scgvec, & 2668 wc_gpr(my_iqf)%buffer_cplx(1,my_ir), wct_scbox) 2669 end associate 2670 #endif 2671 end do ! my_iqf 2672 2673 else 2674 ! Each MPI proc operates on a different idat vector at each epoch 2675 idat_list = cshift([(ii, ii=1,gwr%kpt_comm%nproc)], shift=-gwr%kpt_comm%me) 2676 2677 do iepoch=1,gwr%kpt_comm%nproc 2678 call xmpi_win_fence(wct_scbox_win) 2679 idat = idat_list(iepoch) 2680 if (idat > ndat) goto 10 2681 if (iepoch == 1) wct_scbox(:,idat) = czero_gw 2682 2683 do my_iqf=1,gwr%my_nkbz 2684 iq_bz = gwr%my_qbz_inds(my_iqf) 2685 #if 1 2686 call desc_myqbz(my_iqf)%to_scbox(gwr%qbz(:,iq_bz), gwr%ngqpt, sc_ngfft, gwr%nspinor * ndat1, & 2687 wc_gpr(my_iqf)%buffer_cplx(1,my_ir+idat-1), wct_scbox(:,idat)) 2688 #else 2689 gg = nint(gwr%qbz(:, iq_bz) * gwr%ngqpt) 2690 associate (desc_q => desc_myqbz(my_iqf)) 2691 do ig=1,desc_q%npw 2692 wc_scgvec(:,ig) = gg + gwr%ngqpt * desc_q%gvec(:,ig) ! q + g' 2693 end do 2694 call gsph2box(sc_ngfft, desc_q%npw, gwr%nspinor * ndat1, wc_scgvec, & 2695 wc_gpr(my_iqf)%buffer_cplx(1,my_ir+idat-1), wct_scbox(:,idat)) 2696 end associate 2697 #endif 2698 end do ! my_iqf 2699 10 continue 2700 !call xmpi_barrier(gwr%kpt_comm%value) 2701 !IF (.not. MPI_ASYNC_PROTECTS_NONBLOCKING) CALL MPI_F_SYNC_REG(wct_scbox) 2702 call xmpi_win_fence(wct_scbox_win) 2703 end do ! iepoch 2704 end if 2705 2706 !call timab(1930, 2, tsec) 2707 2708 end subroutine gwr_wcq_to_scbox
m_gwr/sc_sum [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
sc_sum
FUNCTION
INPUTS
OUTPUT
SOURCE
7962 subroutine sc_sum(sc_shape, uc_ngfft, nspinor, ph1d, k_is_gamma, alpha, sc_data, uc_psi, cout) 7963 7964 !Arguments ------------------------------------ 7965 integer,intent(in) :: sc_shape(3), uc_ngfft(18), nspinor 7966 complex(gwpc),intent(in) :: ph1d(maxval(sc_shape), 3) 7967 logical,intent(in) :: k_is_gamma 7968 complex(gwpc),target,intent(in) :: alpha, uc_psi(uc_ngfft(1)*uc_ngfft(2)*uc_ngfft(3)*nspinor) 7969 complex(gwpc),target,intent(in) :: & 7970 sc_data(uc_ngfft(1)*sc_shape(1)*uc_ngfft(2)*sc_shape(2)*uc_ngfft(3)*sc_shape(3)*nspinor) 7971 complex(gwpc),intent(out) :: cout 7972 7973 !Local variables------------------------------- 7974 integer :: il1, il2, il3, spinor, uc_n1, uc_n2, uc_n3, ix, iy, iz !, idat 7975 complex(gwpc) :: cphase, phl32, phl3 7976 complex(gwpc),contiguous,pointer :: uc_psi_ptr(:,:,:,:), sc_data_ptr(:,:,:,:,:,:,:) 7977 7978 ! ************************************************************************* 7979 7980 uc_n1 = uc_ngfft(1); uc_n2 = uc_ngfft(2); uc_n3 = uc_ngfft(3) 7981 7982 call c_f_pointer(c_loc(uc_psi), uc_psi_ptr, shape=[uc_n1, uc_n2, uc_n3, nspinor]) 7983 call c_f_pointer(c_loc(sc_data), sc_data_ptr, & 7984 shape=[uc_n1, sc_shape(1), uc_n2, sc_shape(2), uc_n3, sc_shape(3), nspinor]) 7985 7986 ABI_CHECK(nspinor == 1, "nspinor 2 not coded") 7987 spinor = 1 7988 cout = zero 7989 7990 if (k_is_gamma) then 7991 ! Don't need to multiply by e^{ik.L} 7992 do il3=1,sc_shape(3) 7993 do iz=1,uc_n3 7994 do il2=1,sc_shape(2) 7995 do iy=1,uc_n2 7996 do il1=1,sc_shape(1) 7997 do ix=1,uc_n1 7998 cout = cout + uc_psi_ptr(ix, iy, iz, spinor) * sc_data_ptr(ix, il1, iy, il2, iz, il3, spinor) 7999 end do 8000 end do 8001 end do 8002 end do 8003 end do 8004 end do 8005 8006 else 8007 ! Need to multiply by e^{ik.L} 8008 do il3=1,sc_shape(3) 8009 phl3 = ph1d(il3, 3) 8010 do iz=1,uc_n3 8011 do il2=1,sc_shape(2) 8012 phl32 = phl3 * ph1d(il2, 2) 8013 do iy=1,uc_n2 8014 do il1=1,sc_shape(1) 8015 cphase = phl32 * ph1d(il1, 1) ! e^{ik.L} 8016 do ix=1,uc_n1 8017 cout = cout + cphase * uc_psi_ptr(ix, iy, iz, spinor) * sc_data_ptr(ix, il1, iy, il2, iz, il3, spinor) 8018 end do 8019 end do 8020 end do 8021 end do 8022 end do 8023 end do 8024 end if 8025 8026 cout = alpha * cout 8027 8028 end subroutine sc_sum
m_gwr/sig_braket_ur [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
sig_braket_ur
FUNCTION
Integrate self-energy matrix elements in the unit cell.
INPUTS
OUTPUT
SOURCE
6048 subroutine sig_braket_ur(sig_rpr, nfftsp, ur_glob, sigm_pm) 6049 6050 !Arguments ------------------------------------ 6051 integer,intent(in) :: nfftsp 6052 type(__slkmat_t),intent(in) :: sig_rpr(2,2) 6053 complex(gwpc),intent(in) :: ur_glob(nfftsp) 6054 complex(gwpc),intent(out) :: sigm_pm(2) 6055 6056 !Local variables------------------------------- 6057 integer :: ipm, ir1, il_r1 !, ierr 6058 complex(gwpc),allocatable :: loc_cwork(:) 6059 6060 ! ************************************************************************* 6061 6062 ! (r',r) with r' local and r-index PBLAS-distributed. 6063 6064 sigm_pm = czero_gw 6065 do ipm=1,2 6066 associate (rp_r => sig_rpr(1,ipm)) 6067 ! Integrate over r' 6068 !ABI_CHECK_IEQ(nfftsp, rp_r%sizeb_local(1), "First dimension should be local to each MPI proc!") 6069 ABI_MALLOC(loc_cwork, (rp_r%sizeb_local(2))) 6070 loc_cwork(:) = matmul(transpose(rp_r%buffer_cplx), ur_glob) 6071 ! TODO 6072 !call xgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc ) 6073 ! Integrate over r. Note complex conjugate. 6074 do il_r1=1,rp_r%sizeb_local(2) 6075 ir1 = rp_r%loc2gcol(il_r1) 6076 sigm_pm(ipm) = sigm_pm(ipm) + conjg(ur_glob(ir1)) * loc_cwork(il_r1) 6077 end do 6078 ABI_FREE(loc_cwork) 6079 end associate 6080 end do 6081 6082 end subroutine sig_braket_ur
m_gwr/write_notations [ Functions ]
[ Top ] [ m_gwr ] [ Functions ]
NAME
write_notations
FUNCTION
Write the meaning of the different columns.
SOURCE
6008 subroutine write_notations(units) 6009 integer,intent(in) :: units(:) 6010 integer :: ii, unt 6011 6012 do ii=1,size(units) 6013 unt = units(ii) 6014 write(unt,"(a)")repeat("=", 80) 6015 write(unt,"(a)")" QP results (energies in eV)" 6016 write(unt,"(a)")" Notations:" 6017 write(unt,"(a)")" E0: Kohn-Sham energy" 6018 write(unt,"(a)")" <VxcDFT>: Matrix elements of Vxc[n_val] without non-linear core correction (if any)" 6019 write(unt,"(a)")" SigX: Matrix elements of Sigma_x" 6020 write(unt,"(a)")" SigC(E0): Matrix elements of Sigma_c at E0" 6021 write(unt,"(a)")" Z: Renormalization factor" 6022 write(unt,"(a)")" E-E0: Difference between the QP and the KS energy." 6023 write(unt,"(a)")" E-Eprev: Difference between QP energy at iteration i and i-1" 6024 write(unt,"(a)")" E: Quasi-particle energy" 6025 write(unt,"(a)")" Occ(E): Occupancy of QP state" 6026 !write(unt,"(a)")" SE1(eKS): Real part of the self-energy computed at the KS energy, SE2 for imaginary part." 6027 write(unt,"(a)")" " 6028 write(unt,"(a)")" " 6029 end do 6030 end subroutine write_notations