TABLE OF CONTENTS
ABINIT/dfpt_cgwf [ Functions ]
NAME
dfpt_cgwf
FUNCTION
Update one single wavefunction (cwavef), non self-consistently. Uses a conjugate-gradient algorithm. Try to keep close to the formulas in PRB 55, 10337 (1997) [[cite:Gonze1997]], for the non-self-consistent case, except that we are computing here the second-derivative of the total energy, and not E(2). There is a factor of 2 between the two quantities. The wavefunction that is generated is always orthogonal to cgq. It is orthogonal to the active Hilbert space, and will be complemented by contributions from the active space in the calling routine, if needed. As concerns the MPI algorithm: cgq and gscq are distributed inside comm_band and each proc has nband_me non-overlapping blocks. Each proc in comm_band call dfpt_cgwf with a different u^1_{band} state (the band index is therefore LOCAL) but then we need to communicate every time we call projbd to orthogonalize wrt the MPI-distributed cgq. Other arrays such as rank_band and bands_treated_now are GLOBAL i.e. all procs in comm_band are supposed to call dfpt_cgwf with the same values.
INPUTS
u1_band_=which particular band we are converging (LOCAL) A negative value is used when the routine is called in band-mode with MPI-distributed cgq to indicate that this proc is not optimizing abs(u1_band). Used when calling dfpt_cgw in EPH. band_me=cpu-local index in cgq array of band which we are converging. berryopt=option for Berry phase cgq(2,mcgq)=wavefunction coefficients for MY bands at k+Q cwave0(2,npw*nspinor)=GS wavefunction at k, in reciprocal space cwaveprj0(natom,nspinor*usecprj)=GS wave function at k projected with nl projectors rank_band(nband)=rank of processor in band_comm which have the other bands for cgq below (GLOBAL) bands_treated_now(nband) (GLOBAL) eig0_k=0-order eigenvalues for the present wavefunction at k eig0_kq(nband)=GS eigenvalues at k+Q (hartree) grad_berry(2,mpw1,dtefield%mband_occ) = the gradient of the Berry phase term gscq(2,mgscq)=<g|S|Cnk+q> coefficients for MY bands (PAW) at k+Q gs_hamkq <type(gs_hamiltonian_type)>=all data for the Hamiltonian at k+Q icgq=shift to be applied on the location of data in the array cgq igscq=shift to be applied on the location of data in the array gscq idir=direction of the perturbation ipert=type of the perturbation mcgq=second dimension of the cgq array mgscq=second dimension of gscq, with only mband_mem bands mpi_enreg=information about MPI parallelization mpw1=maximum number of planewave for first-order wavefunctions natom=number of atoms in cell. nband=number of bands. nband_me=number of bands on this cpu nbdbuf=number of buffer bands for the minimisation nline=number of line minimizations per band. npw=number of planewaves in basis sphere at given k. npw1=number of planewaves in basis sphere at k+Q nspinor=number of spinorial components of the wavefunctions opt_gvnlx1=option controlling the use of gvnlx1 array: 0: used as an output 1: used as an input: - used only for ipert=natom+2 NCPP: contains the ddk 1-st order WF PAW: contains frozen part of 1st-order hamiltonian 2: used as input/ouput: - used only for PAW and ipert=natom+2 At input: contains the ddk 1-st order WF (times i) At output: contains frozen part of 1st-order hamiltonian prtvol=control print volume and debugging output quit= if 1, proceeds to smooth ending of the job. dfpt_sciss=scissor shift (Ha) tolrde=tolerance on the ratio of differences of energies (for the line minimisation) tolwfr=tolerance on largest wf residual usedcwavef=flag controlling the use of dcwavef array (PAW only): 0: not used (not allocated) 1: used as input 2: used as output wfoptalg=govern the choice of algorithm for wf optimisation (0 or 10, at present)
OUTPUT
eig1_k(2*nband**2)=matrix of first-order eigenvalues (hartree) eig1(:,ii,jj)=<C0 ii|H1|C0 jj> for norm-conserving psps eig1(:,ii,jj)=<C0 ii|H1-(eig0_k+eig0_k+q)/2.S(1)|C0 jj> for PAW ghc(2,npw1*nspinor)=<G|H0-eig0_k.I|C1 band,k> (NCPP) or <G|H0-eig0_k.S0|C1 band,k> (PAW) gvnlxc(2,npw1*nspinor)=<G|Vnl+VFockACE|C1 band,k> gvnlx1(2,npw1*nspinor)= part of <G|K1+Vnl1|C0 band,k> not depending on VHxc1 (NCPP) or part of <G|K1+Vnl1-eig0k.S1|C0 band,k> not depending on VHxc1 (PAW) resid=wf residual for current band gh1c_n= <G|H1|C0 band,k> (NCPP) or <G|H1-eig0k.S1|C0 band,k> (PAW). This vector is not projected on the subspace orthogonal to the WF. === if gs_hamkq%usepaw==1 === gsc(2,npw1*nspinor*usepaw)=<G|S0|C1 band,k>
SIDE EFFECTS
Input/Output: cwavef(2,npw1*nspinor)=first-order wavefunction at k,q, in reciprocal space (updated) === if gs_hamkq%usepaw==1 === cwaveprj(natom,nspinor)= wave functions at k projected with nl projectors === if also usedcwavef>0 === dcwavef(2,npw1*nspinor)=change of wavefunction due to change of overlap: dcwavef is delta_Psi(1)=-1/2.Sum_{j}[<C0_k+q_j|S(1)|C0_k_i>.|C0_k+q_j>] see PRB 78, 035105 (2008) [[cite:Audouze2008]], Eq. (42) input if usedcwavef=1, output if usedcwavef=2
SOURCE
152 subroutine dfpt_cgwf(u1_band_,band_me,rank_band,bands_treated_now,berryopt,cgq,cwavef,cwave0,cwaveprj,cwaveprj0,& 153 & rf2,dcwavef,& 154 & eig0_k,eig0_kq,eig1_k,ghc,gh1c_n,grad_berry,gsc,gscq,& 155 & gs_hamkq,gvnlxc,gvnlx1,icgq,idir,ipert,igscq,& 156 & mcgq,mgscq,mpi_enreg,mpw1,natom,nband,nband_me,nbdbuf,nline_in,npw,npw1,nspinor,& 157 & opt_gvnlx1,prtvol,quit,resid,rf_hamkq,dfpt_sciss,tolrde,tolwfr,& 158 & usedcwavef,wfoptalg,nlines_done) 159 160 !Arguments ------------------------------------ 161 !scalars 162 integer,intent(in) :: u1_band_,berryopt 163 integer,intent(in) :: band_me, nband_me 164 integer,intent(in) :: icgq,idir,igscq,ipert,mcgq,mgscq,mpw1,natom,nband 165 integer,intent(in) :: nbdbuf,nline_in,npw,npw1,nspinor,opt_gvnlx1 166 integer,intent(in) :: prtvol,quit,usedcwavef,wfoptalg 167 integer,intent(inout) :: nlines_done 168 real(dp),intent(in) :: dfpt_sciss,tolrde,tolwfr 169 real(dp),intent(out) :: resid 170 type(MPI_type),intent(in) :: mpi_enreg 171 type(rf2_t), intent(in) :: rf2 172 type(gs_hamiltonian_type),intent(inout) :: gs_hamkq 173 type(rf_hamiltonian_type),intent(inout) :: rf_hamkq 174 !arrays 175 integer,intent(in) :: rank_band(nband) 176 integer,intent(in) :: bands_treated_now (nband) 177 real(dp),intent(in) :: cgq(2,mcgq),eig0_kq(nband) 178 real(dp),intent(in) :: eig0_k(nband) 179 real(dp),intent(in) :: grad_berry(2,mpw1*nspinor,nband),gscq(2,mgscq) 180 real(dp),intent(inout) :: cwave0(2,npw*nspinor),cwavef(2,npw1*nspinor) 181 real(dp),intent(inout) :: dcwavef(2,npw1*nspinor*((usedcwavef+1)/2)) 182 real(dp),intent(inout) :: eig1_k(2*nband**2) 183 real(dp),intent(out) :: gh1c_n(2,npw1*nspinor) 184 real(dp),intent(out) :: ghc(2,npw1*nspinor) 185 real(dp),intent(out) :: gsc(2,npw1*nspinor*gs_hamkq%usepaw) 186 real(dp),intent(inout) :: gvnlx1(2,npw1*nspinor),gvnlxc(2,npw1*nspinor) 187 type(pawcprj_type),intent(inout) :: cwaveprj(natom,nspinor) 188 type(pawcprj_type),intent(inout) :: cwaveprj0(natom,nspinor*gs_hamkq%usecprj) 189 190 !Local variables------------------------------- 191 !scalars 192 integer,parameter :: level=15,tim_getgh1c=1,tim_getghc=2,tim_projbd=2 193 integer,save :: nskip=0 194 integer :: cpopt,iband,igs,iline,indx_cgq,ipw,me_g0,comm_fft 195 integer :: iband_me, jband_me, ierr, me_band, np_band, band_off, u1_band !, unit_me 196 integer :: ipws,ispinor,istwf_k,jband,nline,optlocal,optnl,dc_shift_band,sij_opt 197 integer :: test_is_ok,useoverlap,usepaw,usevnl,usetolrde 198 real(dp) :: d2edt2,d2te,d2teold,dedt,deltae,deold,dotgg 199 real(dp) :: dotgp,doti,dotr,eshift,eshiftkq,gamma,optekin,prod1,prod2 200 real(dp) :: theta,tol_restart,u1h0me0u1 201 logical :: gen_eigenpb 202 integer :: skipme, bands_skipped_now(nband) 203 character(len=500) :: msg 204 !arrays 205 real(dp) :: dummy(0,0),tsec(2) 206 real(dp) :: eig1_k_loc(2,nband,nband) 207 real(dp),allocatable :: conjgr(:,:),cwaveq(:,:),cwwork(:,:),direc(:,:) 208 real(dp),allocatable :: gberry(:,:),gh1c(:,:),gh_direc(:,:),gresid(:,:),gvnlx1_saved(:,:) 209 real(dp),allocatable :: gs1c(:,:),gvnlx_direc(:,:),pcon(:),sconjgr(:,:) 210 real(dp),allocatable :: scprod(:,:),work(:,:),work1(:,:),work2(:,:) 211 real(dp),pointer :: kinpw1(:) 212 type(pawcprj_type),allocatable :: conjgrprj(:,:) 213 type(pawcprj_type) :: cprj_dummy(0,0) 214 215 ! ********************************************************************* 216 217 DBG_ENTER("COLL") 218 219 call timab(122,1,tsec) 220 221 !====================================================================== 222 !========= LOCAL VARIABLES DEFINITIONS AND ALLOCATIONS ================ 223 !==================================================================== 224 225 u1_band = abs(u1_band_) 226 227 nline = nline_in 228 usetolrde = 1 229 230 ! LB-23/04/17: 231 ! For ipert=natom+10 or ipert=natom+11, the Sternheimer equation is non-self-consistent, so we have 232 ! to solve a true linear problem (A.X = B) for each kpoint and band. In this case, the conjugate 233 ! gradient algorithm can find the exact solution (within the numerical precision) with only ONE call 234 ! of dfpt_cgwf (per kpoint and band...). The solution is found with at most N iterations, N being the dimension of X. 235 ! In order to avoid useless scfcv loops (and many calls of rf2_init, which can be time consuming), 236 ! we want to leave this routine only if 'tolwfr' is reached. Consequently, 'tolrde' is not used and 'nline' is set to 100. 237 ! One could use nline=npw1*nspinor (>> 100 !) instead, but when the method cannot converge (i.e when tolwfr is lower than the 238 ! numerical noise) the program could be stuck here for a very long time. 239 ! NOTE : This is also true for ipert==natom+1, but a lot of references in the test suite have to be changed... 240 if(ipert==natom+10.or.ipert==natom+11) then 241 nline = 100 ! The default value is only 4... This should be sufficient to converge with nstep=1 or 2 242 if (nline_in > 100) nline = nline_in ! Keep the possibility to increase nline 243 usetolrde = 0 ! see below 244 end if 245 246 if (prtvol>=10) then 247 !Tell us what is going on: 248 write(msg,'(a,i6,2x,a,i3,a)')' --- dfpt_cgwf is called for band',u1_band,'for',nline,' lines' 249 call wrtout(std_out,msg) 250 end if 251 252 me_g0 = mpi_enreg%me_g0 253 comm_fft = mpi_enreg%comm_fft 254 me_band = mpi_enreg%me_band 255 np_band = mpi_enreg%nproc_band 256 !unit_me = 300+u1_band 257 258 skipme = 0 259 260 ! if PAW, one has to solve a generalized eigenproblem 261 usepaw=gs_hamkq%usepaw 262 gen_eigenpb=(usepaw==1) 263 useoverlap=0;if (gen_eigenpb) useoverlap=1 264 265 ! Use scissor shift on 0-order eigenvalue 266 eshift=eig0_k(u1_band)-dfpt_sciss 267 268 ! Additional initializations 269 istwf_k=gs_hamkq%istwf_k 270 optekin=0;if (wfoptalg>=10) optekin=1 271 tol_restart=tol12;if (gen_eigenpb) tol_restart=tol8 272 if (ipert == natom+10 .or. ipert == natom+11) tol_restart = tol7 273 274 kinpw1 => gs_hamkq%kinpw_kp 275 276 ! Memory allocations 277 ABI_MALLOC(gh1c,(2,npw1*nspinor)) 278 ABI_MALLOC(pcon,(npw1)) 279 ABI_MALLOC(scprod,(2,nband_me)) 280 281 if (berryopt== 4.or.berryopt== 6.or.berryopt== 7.or. berryopt==14.or.berryopt==16.or.berryopt==17) then 282 ABI_MALLOC(gberry,(2,npw1*nspinor)) 283 gberry(:,1:npw1*nspinor)=grad_berry(:,1:npw1*nspinor,u1_band) 284 else 285 ABI_MALLOC(gberry,(0,0)) 286 end if 287 288 !TODO MJV: this should probably be adjusted as well for natom+10/11 perts: set to band_me instead of u1_band 289 dc_shift_band=(u1_band-1)*npw1*nspinor 290 291 ! this is used many times - no use de and re allocating 292 ABI_MALLOC(work,(2,npw1*nspinor)) 293 294 !DEBUG!! Several checking statements 295 if (prtvol==-level.or.prtvol==-19.or.prtvol==-20) then 296 write(msg,'(a)') " ** cgwf3 : debugging mode, tests will be done" 297 ! Search CGWF3_WARNING in the log file to find errors (if any) 298 call wrtout(std_out,msg) 299 ABI_MALLOC(work1,(2,npw1*nspinor)) 300 ! ===== Check <Psi_k+q^(0)|S(0)|Psi_k+q^(0)>=delta_{ij} 301 if (.not.gen_eigenpb) work1(:,:)=cgq(:,1+npw1*nspinor*(band_me-1)+icgq:npw1*nspinor*band_me+icgq) 302 if ( gen_eigenpb) work1(:,:)=gscq(:,1+npw1*nspinor*(band_me-1)+igscq:npw1*nspinor*band_me+igscq) 303 ! NB: this loop is not band-block diagonal 304 ! the present logic does a lot of communication: 1 for each band. 305 ! Could be grouped outside the jband loop into 1 big one, but if the wf are big this is a waste. Tradeoffs... 306 jband_me = 0 307 do jband=1,nband 308 if (bands_treated_now(jband) == 0) cycle 309 if (rank_band(jband) == me_band) then 310 jband_me = jband_me + 1 311 work(:,:)=cgq(:,1+npw1*nspinor*(jband_me-1)+icgq:npw1*nspinor*jband_me+icgq) 312 end if 313 ! send to everyone else, who is also working on jband right now 314 call xmpi_bcast(work,rank_band(jband),mpi_enreg%comm_band,ierr) 315 316 call dotprod_g(dotr,doti,istwf_k,npw1*nspinor,2,work1,work,me_g0,mpi_enreg%comm_spinorfft) 317 test_is_ok=1 318 if(jband==u1_band) then 319 if(abs(dotr-one)>tol12) test_is_ok=0 320 else 321 if(abs(dotr)>tol12) test_is_ok=0 322 end if 323 if(abs(doti)>tol12) test_is_ok=0 324 if(test_is_ok/=1) then 325 write(msg,'(a,i3,a,2es22.15)') "CGWF3_WARNING : <Psi_k+q,i^(0)|S(0)|Psi_k+q,j^(0)> for band j=",jband," is ",dotr,doti 326 call wrtout(std_out,msg) 327 end if 328 end do 329 330 ! ===== Check Pc.Psi_k+q^(0)=0 331 ! each jband is checked by everybody, against the bands attributed to present cpu 332 ! NB - this does not depend on the "band" input 333 jband_me = 0 334 do jband=1,nband 335 if (bands_treated_now(jband) == 0) cycle 336 if (rank_band(jband) == me_band) then 337 jband_me = jband_me + 1 338 work(:,:)=cgq(:,1+npw1*nspinor*(jband_me-1)+icgq:npw1*nspinor*jband_me+icgq) 339 end if 340 ! send to everyone else, who is also working on jband right now 341 call xmpi_bcast(work,rank_band(jband),mpi_enreg%comm_band,ierr) 342 work1 = work 343 344 call projbd(cgq,work,-1,icgq,igscq,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 345 gscq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 346 347 ! if bands are parallelized, I have only projected against bands on my cpu 348 ! Pc|work> = |work> - Sum_l <psi_{k+q, l}|work> |psi_{k+q, l}> 349 ! = Sum_nproc_band (|work> - Sum_{my l} <psi_{k+q, l}|work> |psi_{k+q, l}>) - (nproc_band-1) |work> 350 ! 351 if (mpi_enreg%nproc_band > 1) then 352 call xmpi_sum(work,mpi_enreg%comm_band,ierr) 353 !TODO: make this a blas call? zaxpy 354 work = work - (mpi_enreg%nproc_band-1)*work1 355 end if 356 357 call sqnorm_g(dotr,istwf_k,npw1*nspinor,work,me_g0,comm_fft) 358 if(sqrt(dotr)>tol12) then 359 write(msg,'(a,i3,a,es22.15)') "CGWF3_WARNING : Norm of Pc.Psi_k+q_j^(0) for band j=",jband," is ",sqrt(dotr) 360 call wrtout(std_out,msg) 361 end if 362 end do 363 364 ! ===== Check Pc.Psi_k^(0)=0 365 ! NB: this _does_ depend on the input band "band" stored in cwave0 366 do iband = 1, nband 367 if (bands_treated_now(iband) == 0) cycle 368 if (rank_band(iband) == me_band) work(:,:)=cwave0(:,:) 369 370 ! send to everyone else, who is also working on jband right now 371 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 372 work1 = work 373 call projbd(cgq,work,-1,icgq,igscq,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 374 gscq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 375 376 if (mpi_enreg%nproc_band > 1) then 377 call xmpi_sum(work,mpi_enreg%comm_band,ierr) 378 !TODO: make this a blas call? zaxpy 379 work = work - (mpi_enreg%nproc_band-1)*work1 380 end if 381 382 call sqnorm_g(dotr,istwf_k,npw1*nspinor,work,me_g0,comm_fft) 383 if(sqrt(dotr)>tol12) then 384 write(msg,'(a,i3,a,es22.15)') "CGWF3_WARNING : Norm of Pc.Psi_k^(0) for band ",u1_band," is ",sqrt(dotr) 385 call wrtout(std_out,msg) 386 end if 387 end do 388 389 390 ! ===== Check Pc.dcwavef=0 (for 2nd order only) 391 if(ipert==natom+10.or.ipert==natom+11) then 392 do iband = 1, nband 393 if (bands_treated_now(iband) == 0) cycle 394 if (rank_band(iband) == me_band) then 395 work(:,:)=rf2%dcwavef(:,1+dc_shift_band:npw1*nspinor+dc_shift_band) 396 end if 397 ! send to everyone else, who is also working on jband right now 398 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 399 work1 = work 400 call projbd(cgq,work,-1,icgq,igscq,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 401 gscq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 402 403 if (mpi_enreg%nproc_band > 1) then 404 call xmpi_sum(work,mpi_enreg%comm_band,ierr) 405 !TODO: make this a blas call? zaxpy 406 work = work - (mpi_enreg%nproc_band-1)*work1 407 end if 408 409 call sqnorm_g(dotr,istwf_k,npw1*nspinor,work,me_g0,comm_fft) 410 if(sqrt(dotr)>tol10) then 411 write(msg,'(a,i3,a,es22.15)') "CGWF3_WARNING : Norm of Pc.dcwavef for band ",u1_band," is ",sqrt(dotr) 412 call wrtout(std_out,msg) 413 end if 414 end do 415 end if 416 417 ! ===== Check Pc^*.S(0).Psi_k+q^(0)=0 418 ! NB: here again does not depend on input "u1_band" 419 if (gen_eigenpb) then 420 jband_me = 0 421 do jband=1,nband 422 if (bands_treated_now(jband) == 0) cycle 423 if (rank_band(jband) == me_band) then 424 jband_me = jband_me + 1 425 work(:,:)=gscq(:,1+npw1*nspinor*(jband_me-1)+igscq:npw1*nspinor*jband_me+igscq) 426 end if 427 ! send to everyone else, who is also working on jband right now 428 call xmpi_bcast(work,rank_band(jband),mpi_enreg%comm_band,ierr) 429 work1 = work 430 431 call projbd(gscq,work,-1,igscq,icgq,istwf_k,mgscq,mcgq,nband_me,npw1,nspinor,& 432 cgq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 433 434 if (mpi_enreg%nproc_band > 1) then 435 call xmpi_sum(work,mpi_enreg%comm_band,ierr) 436 !TODO: make this a blas call? zaxpy 437 work = work - (mpi_enreg%nproc_band-1)*work1 438 end if 439 440 call sqnorm_g(dotr,istwf_k,npw1*nspinor,work,me_g0,comm_fft) 441 if(sqrt(dotr)>tol12) then 442 write(msg,'(a,i3,a,es22.15)') "CGWF3_WARNING : Norm of Pc^*.S(0).Psi_k+q_j^(0) for band j=",jband," is ",sqrt(dotr) 443 call wrtout(std_out,msg) 444 end if 445 end do 446 end if 447 ABI_FREE(work1) 448 end if 449 !ENDDEBUG!! Several checking statements 450 451 452 !====================================================================== 453 !========== INITIALISATION OF MINIMIZATION ITERATIONS ================= 454 !====================================================================== 455 456 if(ipert/=natom+10.and.ipert/=natom+11) then 457 ! The following is needed for first order perturbations only 458 ! Otherwise, the work is already done in rf2_init (called in dfpt_vtowfk.F90) 459 460 ! Compute H(1) applied to GS wavefunction Psi(0) 461 if (gen_eigenpb) then 462 sij_opt=1 463 ABI_MALLOC(gs1c,(2,npw1*nspinor)) 464 else 465 ABI_MALLOC(gs1c,(0,0)) 466 sij_opt=0 467 end if 468 usevnl=1; optlocal=1; optnl=2 469 if (prtvol==-level.or.prtvol==-19) then 470 ABI_MALLOC(gvnlx1_saved,(2,npw1*nspinor)) 471 gvnlx1_saved(:,:) = gvnlx1(:,:) 472 end if 473 call getgh1c(berryopt,cwave0,cwaveprj0,gh1c,gberry,gs1c,gs_hamkq,gvnlx1,idir,ipert,eshift,& 474 mpi_enreg,optlocal,optnl,opt_gvnlx1,rf_hamkq,sij_opt,tim_getgh1c,usevnl) 475 476 if (gen_eigenpb) then 477 if (ipert/=natom+2) then ! S^(1) is zero for ipert=natom+2 478 !$OMP PARALLEL 479 !$OMP DO 480 do ipw=1,npw1*nspinor 481 gh1c (1:2,ipw)=gh1c (1:2,ipw)-eshift*gs1c(1:2,ipw) 482 end do 483 !$OMP END DO NOWAIT 484 if (opt_gvnlx1/=1) then 485 !$OMP DO 486 do ipw=1,npw1*nspinor 487 gvnlx1(1:2,ipw)=gvnlx1(1:2,ipw)-eshift*gs1c(1:2,ipw) 488 end do 489 !$OMP END DO NOWAIT 490 end if 491 !$OMP END PARALLEL 492 end if 493 494 ! If generalized eigenPb and dcwavef requested, compute it: 495 ! dcwavef is delta_Psi(1)=-1/2.Sum_{j}[<C0_k+q_j|S(1)|C0_k_i>.|C0_k+q_j>] 496 ! see PRB 78, 035105 (2008) [[cite:Audouze2008]], Eq. (42) 497 if (usedcwavef==2) then 498 call getdc1(u1_band,rank_band,bands_treated_now,cgq,cprj_dummy,dcwavef,cprj_dummy,& 499 & 0,icgq,istwf_k,mcgq,0,& 500 & mpi_enreg,natom,nband,nband_me,npw1,nspinor,0,gs1c) 501 end if 502 end if ! gen_eigenpb 503 504 else 505 ! 2nd order case (wrt k perturbation) 506 ! Copy RHS_Stern(:,:) of the given band in gh1c 507 gh1c(:,:)=rf2%RHS_Stern(:,1+dc_shift_band:npw1*nspinor+dc_shift_band) 508 end if 509 510 if (prtvol==-level.and.usedcwavef==2) then 511 !Check that Pc^*.(H^(0)-E.S^(0)).delta_Psi^(1) is zero ! This is a consequence of P_c delta_Psi^(1) = 0 512 ABI_MALLOC(cwwork,(2,npw1*nspinor)) 513 do iband = 1, nband 514 if (bands_treated_now(iband) == 0) cycle 515 if (rank_band(iband) == me_band) then 516 cwwork=dcwavef 517 ! - Apply H^(0)-E.S^(0) to delta_Psi^(1) 518 sij_opt=0;if (gen_eigenpb) sij_opt=-1 519 cpopt=-1 520 ABI_MALLOC(work1,(2,npw1*nspinor*((sij_opt+1)/2))) 521 ABI_MALLOC(work2,(2,npw1*nspinor)) 522 call getghc(cpopt,cwwork,conjgrprj,work,work1,gs_hamkq,work2,eshift,mpi_enreg,& 523 1,prtvol,sij_opt,tim_getghc,0,select_k=KPRIME_H_KPRIME) 524 ABI_FREE(work1) 525 ABI_FREE(work2) 526 end if 527 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 528 cwwork=work 529 530 ! -Apply Pc^* 531 call projbd(gscq,cwwork,-1,igscq,icgq,istwf_k,mgscq,mcgq,nband_me,npw1,nspinor,& 532 cgq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 533 534 call xmpi_sum(cwwork,mpi_enreg%comm_band,ierr) 535 536 if (mpi_enreg%nproc_band > 1) then 537 !TODO: make this a blas call? zaxpy 538 cwwork = cwwork - (mpi_enreg%nproc_band-1)*work 539 end if 540 541 call sqnorm_g(dotr,istwf_k,npw1*nspinor,cwwork,me_g0,comm_fft) 542 if(sqrt(dotr)>tol12) then 543 write(msg,'(a,i3,a,es22.15)') 'CGWF3_WARNING : |Pc^*.(H^(0)-E.S^(0)).delta_Psi^(1)| (band ',u1_band,')=',sqrt(dotr) 544 call wrtout(std_out,msg) 545 end if 546 end do 547 ABI_FREE(cwwork) 548 end if ! prtvol==-level.and.usedcwavef==2 549 550 call cg_zcopy(npw1*nspinor,gh1c,gh1c_n) 551 552 ! Projecting out all bands 553 ! While we could avoid calculating all the eig1_k to obtain the perturbed density, 554 ! we do need all of the matrix elements when outputting the full 1st-order wfn. 555 ! Note the subtlety: 556 ! -For the generalized eigenPb, S|cgq> is used in place of |cgq>, 557 ! in order to apply P_c+ projector (see PRB 73, 235101 (2006) [[cite:Audouze2006]], Eq. (71), (72)) 558 eig1_k_loc = zero 559 do iband = 1, nband 560 if (bands_treated_now(iband) == 0) cycle 561 if (rank_band(iband) == me_band) work = gh1c 562 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 563 564 if(gen_eigenpb)then 565 call projbd(gscq,work,-1,igscq,icgq,istwf_k,mgscq,mcgq,nband_me,npw1,nspinor,& 566 cgq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 567 else 568 call projbd(cgq,work,-1,icgq,0,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 569 dummy,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 570 end if 571 572 ! sum projections against all bands k+q 573 call xmpi_sum_master(work, rank_band(iband), mpi_enreg%comm_band, ierr) 574 575 ! scprod now contains scalar products of band iband (runs over all bands in current queue) with local bands j 576 jband_me = 0 577 do jband=1,nband 578 if (rank_band(jband) /= me_band) cycle 579 jband_me = jband_me + 1 580 eig1_k_loc(:,jband,iband)=scprod(:,jband_me) 581 end do 582 583 ! save this for me only 584 !TODO: make this a blas call? zaxpy 585 if (rank_band(iband) == me_band) then 586 gh1c = work - (mpi_enreg%nproc_band-1)*gh1c 587 end if 588 589 end do !iband 590 591 592 if(ipert/=natom+10.and.ipert/=natom+11) then 593 ! For ipert=natom+10 or natom+11, this is done in rf2_init 594 595 ! The array eig1_k contains: 596 ! <u_(jband,k+q)^(0)|H_(k+q,k)^(1)|u_(iband,k)^(0)> (NC psps) 597 ! or <u_(jband,k+q)^(0)|H_(k+q,k)^(1)-(eig0_k+eig0_k+q)/2.S^(1)|u_(iband,k)^(0)> (PAW) 598 ! so in case of PAW need to add the overlap term below 599 ! 600 ! NB: 2019 11 15: MJV: I swapped the names of jband and iband to be more consistent with other loops above 601 if (gen_eigenpb) then 602 do iband=1,nband 603 if (bands_treated_now(iband) == 0) cycle 604 if (rank_band(iband) == me_band) work = gs1c 605 ! for iband on this proc, bcast to all others to get full line of iband,jband pairs 606 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 607 608 ! add PAW overlap correction term to present iband (all procs) and local jband elements 609 indx_cgq=icgq 610 do jband=1,nband 611 if (rank_band(jband) /= me_band) cycle 612 613 eshiftkq=half*(eig0_kq(jband)-eig0_k(iband)) 614 call dotprod_g(dotr,doti,istwf_k,npw1*nspinor,2,cgq(:,indx_cgq+1:indx_cgq+npw1*nspinor),work,& 615 me_g0,mpi_enreg%comm_spinorfft) 616 eig1_k_loc(1,jband,iband)=eig1_k_loc(1,jband,iband)-eshiftkq*dotr 617 eig1_k_loc(2,jband,iband)=eig1_k_loc(2,jband,iband)-eshiftkq*doti 618 indx_cgq=indx_cgq+npw1*nspinor 619 end do 620 end do ! iband 621 end if ! PAW and generalized eigenproblem 622 623 ! No more need of gs1c 624 ABI_FREE(gs1c) 625 626 ! reduce over band procs to fill in the matrix for all jband (distributed over procs) 627 ! must only do this once for eig1_k_loc: now have all jband for current ibands on all procs 628 call xmpi_sum(eig1_k_loc,mpi_enreg%comm_band,ierr) 629 630 ! TODO: I think this is just a reshape 631 do iband=1,nband 632 if (bands_treated_now(iband) == 0) cycle 633 band_off=(iband-1)*2*nband 634 do jband=1,nband 635 eig1_k(2*jband-1+band_off)=eig1_k_loc(1,jband,iband) 636 eig1_k(2*jband +band_off)=eig1_k_loc(2,jband,iband) 637 end do 638 end do 639 end if ! ipert/=natom+10.and.ipert/=natom+11 640 641 ! Filter the wavefunctions for large modified kinetic energy (see routine mkkin.f) 642 ! TODO: should this also be applied to cwaveq for the preconditioning with kinpw1 below? 643 do ispinor=1,nspinor 644 ipws=(ispinor-1)*npw1 645 !$OMP PARALLEL DO PRIVATE(ipw) SHARED(cwavef,kinpw1,ipws,npw1) 646 do ipw=1+ipws,npw1+ipws 647 if(kinpw1(ipw-ipws)>huge(zero)*1.d-11)then 648 cwavef(1:2,ipw)=zero 649 end if 650 end do 651 end do 652 653 ! Apply the orthogonality condition: <C1 k,q|C0 k+q>=0 (NCPP) or <C1 k,q|S0|C0 k+q>=0 (PAW) 654 ! Project out all bands from cwavef, i.e. apply P_c projector on cwavef 655 ! (this is needed when there are some partially or unoccupied states) 656 do iband = 1, nband 657 if (bands_treated_now(iband) == 0) cycle 658 659 if (rank_band(iband) == me_band) work = cwavef 660 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 661 662 call projbd(cgq,work,-1,icgq,igscq,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 663 gscq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 664 665 call xmpi_sum_master(work, rank_band(iband), mpi_enreg%comm_band, ierr) 666 667 ! save this for me_band only 668 !TODO: make this a blas call? zaxpy 669 if (rank_band(iband) == me_band) then 670 cwavef = work - (mpi_enreg%nproc_band-1)*cwavef 671 end if 672 end do 673 674 675 if(ipert/=natom+10.and.ipert/=natom+11) then 676 ! If PAW, the orthogonality condition is <C1 k,q|S0|C0 k+q>+1/2<C0 k|S1|C0 k+q>=0 677 if (usepaw==1.and.usedcwavef>0) then 678 !$OMP PARALLEL DO 679 do ipw=1,npw1*nspinor 680 cwavef(1:2,ipw)=cwavef(1:2,ipw)+dcwavef(1:2,ipw) 681 end do 682 end if 683 else 684 ! In 2nd order case, dcwavef/=0 even in NC, and it is already computed in rf2_init (called in dfpt_vtowfk.F90) 685 do ipw=1,npw1*nspinor 686 cwavef(:,ipw)=cwavef(:,ipw)+rf2%dcwavef(:,ipw+dc_shift_band) 687 end do 688 end if 689 690 if (u1_band>max(1,nband-nbdbuf))then 691 ! Treat the case of buffer bands 692 cwavef=zero 693 ghc =zero 694 gvnlxc =zero 695 if (gen_eigenpb) gsc=zero 696 if (usedcwavef==2) dcwavef=zero 697 if (usepaw==1) then 698 call pawcprj_set_zero(cwaveprj) 699 end if 700 if (usedcwavef==2) dcwavef=zero 701 ! Number of one-way 3D ffts skipped 702 nskip=nskip+nline 703 704 ! At the end of the treatment of a set of bands, write the number of one-way 3D ffts skipped 705 if (xmpi_paral==1 .and. u1_band==nband .and. prtvol>=10) then 706 write(msg,'(a,i0)')' dfpt_cgwf: number of one-way 3D ffts skipped in cgwf3 until now =',nskip 707 call wrtout(std_out,msg) 708 end if 709 710 skipme = 1 711 end if 712 713 ! If not a buffer band, perform the optimisation 714 715 ABI_MALLOC(conjgr,(2,npw1*nspinor)) 716 ABI_MALLOC(direc,(2,npw1*nspinor)) 717 ABI_MALLOC(gresid,(2,npw1*nspinor)) 718 ABI_MALLOC(cwaveq,(2,npw1*nspinor)) 719 if (usepaw==1) then 720 ABI_MALLOC(conjgrprj,(natom,nspinor)) 721 call pawcprj_alloc(conjgrprj,0,gs_hamkq%dimcprj) 722 else 723 ABI_MALLOC(conjgrprj,(0,0)) 724 end if 725 726 cwaveq(:,:)=cgq(:,1+npw1*nspinor*(band_me-1)+icgq:npw1*nspinor*band_me+icgq) 727 dotgp=one 728 729 ! Here apply H(0) at k+q to input orthogonalized 1st-order wfs 730 sij_opt=0;if (gen_eigenpb) sij_opt=1 731 cpopt=-1+usepaw 732 call getghc(cpopt,cwavef,cwaveprj,ghc,gsc,gs_hamkq,gvnlxc,eshift,mpi_enreg,1,& 733 prtvol,sij_opt,tim_getghc,0,select_k=KPRIME_H_KPRIME) 734 735 ! ghc also includes the eigenvalue shift 736 if (gen_eigenpb) then 737 call cg_zaxpy(npw1*nspinor, [-eshift, zero], gsc,ghc) 738 else 739 call cg_zaxpy(npw1*nspinor, [-eshift, zero], cwavef,ghc) 740 end if 741 742 ! Initialize resid, in case of nline==0 743 resid=zero 744 745 bands_skipped_now = 0 746 bands_skipped_now(u1_band) = skipme 747 call xmpi_sum(bands_skipped_now,mpi_enreg%comm_band,ierr) 748 749 ! ====================================================================== 750 ! ====== BEGIN LOOP FOR A GIVEN BAND: MINIMIZATION ITERATIONS ========== 751 ! ====================================================================== 752 do iline=1,nline 753 754 755 ! ====================================================================== 756 ! ================= COMPUTE THE RESIDUAL =============================== 757 ! ====================================================================== 758 ! Note that gresid (=steepest-descent vector, Eq.(26) of PRB 55, 10337 (1996) [[cite:Gonze1997]]) 759 ! is precomputed to guarantee cancellation of errors 760 ! and allow residuals to reach values as small as 1.0d-24 or better. 761 if (berryopt== 4.or.berryopt== 6.or.berryopt== 7.or. berryopt==14.or.berryopt==16.or.berryopt==17) then 762 if (ipert==natom+2) then 763 if (opt_gvnlx1/=1) gvnlx1=zero 764 !$OMP PARALLEL DO 765 do ipw=1,npw1*nspinor 766 gresid(1:2,ipw)=-ghc(1:2,ipw)-gh1c(1:2,ipw) 767 end do 768 else 769 !$OMP PARALLEL DO 770 do ipw=1,npw1*nspinor 771 gresid(1,ipw)=-ghc(1,ipw)-gh1c(1,ipw)+gberry(2,ipw) 772 gresid(2,ipw)=-ghc(2,ipw)-gh1c(2,ipw)-gberry(1,ipw) 773 end do 774 end if 775 else 776 !$OMP PARALLEL DO 777 do ipw=1,npw1*nspinor 778 gresid(1:2,ipw)=-ghc(1:2,ipw)-gh1c(1:2,ipw) 779 end do 780 end if 781 782 ! ====================================================================== 783 ! =========== PROJECT THE STEEPEST DESCENT DIRECTION =================== 784 ! ========= OVER THE SUBSPACE ORTHOGONAL TO OTHER BANDS ================ 785 ! ====================================================================== 786 ! Project all bands from gresid into direc: 787 ! The following projection over the subspace orthogonal to occupied bands 788 ! is not optional in the RF case, unlike the GS case. 789 ! However, the order of operations could be changed, so that 790 ! as to make it only applied at the beginning, to H(1) psi(0), 791 ! so, THIS IS TO BE REEXAMINED 792 ! Note the subtlety: 793 ! -For the generalized eigenPb, S|cgq> is used in place of |cgq>, 794 ! in order to apply P_c+ projector (see PRB 73, 235101 (2006) [[cite:Audouze2006]], Eq. (71), (72) 795 do iband = 1, nband 796 if (bands_treated_now(iband) == 0) cycle 797 if (rank_band(iband) == me_band) work = gresid 798 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 799 800 if(gen_eigenpb)then 801 call projbd(gscq,work,-1,igscq,icgq,istwf_k,mgscq,mcgq,nband_me,npw1,nspinor,& 802 cgq, scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 803 else 804 call projbd( cgq,work,-1, icgq, 0,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 805 dummy,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 806 end if 807 808 call xmpi_sum_master(work, rank_band(iband), mpi_enreg%comm_band, ierr) 809 810 ! save this for me_band only 811 !TODO: make this a blas call? zaxpy 812 if (rank_band(iband) == me_band) then 813 gresid = work - (mpi_enreg%nproc_band-1)*gresid 814 end if 815 end do 816 817 call cg_zcopy(npw1*nspinor,gresid,direc) 818 819 ! ====================================================================== 820 ! ============== CHECK FOR CONVERGENCE CRITERIA ======================== 821 ! ====================================================================== 822 823 ! Compute second-order derivative of the energy using a variational expression 824 call dotprod_g(prod1,doti,istwf_k,npw1*nspinor,1,cwavef,gresid,me_g0,mpi_enreg%comm_spinorfft) 825 call dotprod_g(prod2,doti,istwf_k,npw1*nspinor,1,cwavef,gh1c,me_g0,mpi_enreg%comm_spinorfft) 826 d2te=two*(-prod1+prod2) 827 ! write(std_out,'(a,f14.6,a,f14.6)') 'prod1 = ',prod1,' prod2 = ',prod2 ! Keep this debugging feature! 828 829 ! Compute <u_m(1)|H(0)-e_m(0)|u_m(1)> 830 ! (<u_m(1)|H(0)-e_m(0).S|u_m(1)> if gen. eigenPb), 831 ! that should be positive, 832 ! except when the eigenvalue eig_mk(0) is higher than 833 ! the lowest non-treated eig_mk+q(0). For insulators, this 834 ! has no influence, but for metallic occupations, 835 ! the conjugate gradient algorithm breaks down. The solution adopted here 836 ! is very crude, and rely upon the fact that occupancies of such 837 ! levels should be smaller and smaller with increasing nband, so that 838 ! a convergence study will give the right result. 839 ! The same trick is also used later. 840 u1h0me0u1=-prod1-prod2 841 842 ! Some tolerance is allowed, to account for very small numerical inaccuracies and cancellations. 843 if(u1h0me0u1<-tol_restart) then ! .and. skipme == 0)then 844 if (prtvol==-level.or.prtvol==-19) then 845 write(msg,'(a,es22.13e3)') ' cgwf3: u1h0me0u1 = ',u1h0me0u1 846 call wrtout(std_out,msg) 847 end if 848 cwavef =zero 849 ghc =zero 850 gvnlxc =zero 851 if (gen_eigenpb) gsc(:,:)=zero 852 if (usepaw==1) then 853 call pawcprj_set_zero(cwaveprj) 854 end if 855 ! A negative residual will be the signal of this problem ... 856 resid=-one 857 if (prtvol > 0) call wrtout(std_out,' dfpt_cgwf: problem of minimisation (likely metallic), set resid to -1') 858 ! Number of one-way 3D ffts skipped 859 nskip=nskip+(nline-iline+1) 860 861 skipme = 1 862 end if 863 864 if (skipme == 0) then 865 ! Compute residual (squared) norm 866 call sqnorm_g(resid,istwf_k,npw1*nspinor,gresid,me_g0,comm_fft) 867 if (prtvol==-level.or.prtvol==-19)then 868 write(msg,'(a,a,i3,f14.6,a,a,4es12.4)') ch10,& 869 ' dfpt_cgwf : iline,eshift =',iline,eshift,ch10,& 870 ' resid,prod1,prod2,d2te=',resid,prod1,prod2,d2te 871 call wrtout(std_out,msg) 872 end if 873 end if 874 875 ! If residual sufficiently small stop line minimizations 876 if (resid<tolwfr .and. skipme == 0) then 877 if(prtvol>=10)then 878 write(msg,'(a,i4,a,i2,a,es12.4)')& 879 ' dfpt_cgwf: band',u1_band,' converged after ',iline,' line minimizations: resid = ',resid 880 call wrtout(std_out,msg) 881 end if 882 nskip=nskip+(nline-iline+1) ! Number of two-way 3D ffts skipped 883 skipme = 1 884 end if 885 886 ! If user require exiting the job, stop line minimisations 887 if (quit==1) then 888 write(msg,'(a,i0)')' dfpt_cgwf: user require exiting => skip update of band ',u1_band 889 call wrtout(std_out,msg) 890 nskip=nskip+(nline-iline+1) ! Number of two-way 3D ffts skipped 891 exit ! Exit from the loop on iline 892 end if 893 894 ! Check that d2te is decreasing on succeeding lines: 895 if (iline/=1) then 896 if (d2te>d2teold+tol6 .and. u1_band_ > 0) then 897 write(msg,'(a,i0,a,e14.6,a,e14.6)')'New trial energy at line ',iline,' = ',d2te,'is higher than former: ',d2teold 898 ABI_WARNING(msg) 899 end if 900 end if 901 d2teold=d2te 902 903 !write(msg, *)"iline, resid, skipme", iline, resid, skipme; call wrtout(std_out, msg) 904 905 !DEBUG Keep this debugging feature ! 906 !call sqnorm_g(dotr,istwf_k,npw1*nspinor,direc,me_g0,comm_fft) 907 !write(std_out,*)' dfpt_cgwf : before precon, direc**2=',dotr 908 !if (gen_eigenpb) then 909 !call dotprod_g(dotr,doti,istwf_k,npw1*nspinor,1,cwaveq,& 910 !& gscq(:,1+npw1*nspinor*(u1_band-1)+igscq:npw1*nspinor*u1_band+igscq),me_g0,mpi_enreg%comm_spinorfft) 911 !else 912 !call sqnorm_g(dotr,istwf_k,npw1*nspinor,cwaveq,me_g0,comm_fft) 913 !end if 914 !write(std_out,*)' dfpt_cgwf : before precon, cwaveq**2=',dotr 915 !ENDDEBUG 916 917 ! ====================================================================== 918 ! ======== PRECONDITION THE STEEPEST DESCENT DIRECTION ================= 919 ! ====================================================================== 920 921 ! If wfoptalg>=10, the precondition matrix is kept constant 922 ! during iteration; otherwise it is recomputed 923 if (wfoptalg<10.or.iline==1) then 924 !print *, 'cwaveq cg_precon ', cwaveq; print *, 'direc cg_precon ', direc 925 call cg_precon(cwaveq,zero,istwf_k,kinpw1,npw1,nspinor,me_g0,0,pcon,direc,mpi_enreg%comm_fft) 926 else 927 do ispinor=1,nspinor 928 igs=(ispinor-1)*npw1 929 !$OMP PARALLEL DO PRIVATE(ipw) SHARED(igs,npw,direc,pcon) 930 do ipw=1+igs,npw1+igs 931 direc(1:2,ipw)=direc(1:2,ipw)*pcon(ipw-igs) 932 end do 933 end do 934 end if 935 936 !DEBUG Keep this debugging feature ! 937 !call sqnorm_g(dotr,istwf_k,npw1*nspinor,direc,me_g0,comm_fft) 938 !write(std_out,*)' dfpt_cgwf : after precon, direc**2=',dotr 939 !ENDDEBUG 940 941 ! ====================================================================== 942 ! ======= PROJECT THE PRECOND. STEEPEST DESCENT DIRECTION ============== 943 ! ========= OVER THE SUBSPACE ORTHOGONAL TO OTHER BANDS ================ 944 ! ====================================================================== 945 946 ! Projecting again out all bands: 947 ! -For the simple eigenPb, gscq is used as dummy argument 948 do iband = 1, nband 949 if (bands_treated_now(iband) == 0) cycle 950 951 if (rank_band(iband) == me_band) work = direc 952 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 953 954 call projbd(cgq,work,-1,icgq,igscq,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 955 gscq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 956 957 call xmpi_sum_master(work, rank_band(iband), mpi_enreg%comm_band, ierr) 958 959 ! save this for me_band only 960 !TODO: make this a blas call? zaxpy 961 if (rank_band(iband) == me_band) then 962 direc = work - (mpi_enreg%nproc_band-1)*direc 963 end if 964 end do 965 966 967 !DEBUG Keep this debugging feature ! 968 !call sqnorm_g(dotr,istwf_k,npw1*nspinor,direc,me_g0,comm_fft) 969 !write(std_out,*)' dfpt_cgwf : after projbd, direc**2=',dotr 970 !ENDDEBUG 971 972 ! ====================================================================== 973 ! ================= COMPUTE THE CONJUGATE-GRADIENT ===================== 974 ! ====================================================================== 975 976 ! get dot of direction vector with residual vector 977 call dotprod_g(dotgg,doti,istwf_k,npw1*nspinor,1,direc,gresid,me_g0,mpi_enreg%comm_spinorfft) 978 979 if (iline==1) then 980 ! At first iteration, gamma is set to zero 981 gamma=zero 982 dotgp=dotgg 983 call cg_zcopy(npw1*nspinor,direc,conjgr) 984 else 985 ! At next iterations, h = g + gamma * h 986 gamma=dotgg/dotgp 987 dotgp=dotgg 988 if (prtvol==-level.or.prtvol==-19)then 989 write(msg,'(a,2es16.6)') 'dfpt_cgwf: dotgg,gamma = ',dotgg,gamma 990 call wrtout(std_out,msg) 991 end if 992 !$OMP PARALLEL DO 993 do ipw=1,npw1*nspinor 994 conjgr(1:2,ipw)=direc(1:2,ipw)+gamma*conjgr(1:2,ipw) 995 end do 996 if (prtvol==-level.or.prtvol==-19) call wrtout(std_out,'dfpt_cgwf: conjugate direction has been found') 997 end if 998 999 ! ====================================================================== 1000 ! ===== COMPUTE CONTRIBUTIONS TO 1ST AND 2ND DERIVATIVES OF ENERGY ===== 1001 ! ====================================================================== 1002 ! ...along the search direction 1003 1004 ! Compute dedt, Eq.(29) of of PRB55, 10337 (1997) [[cite:Gonze1997]], 1005 ! with an additional factor of 2 for the difference between E(2) and the 2DTE 1006 dedt = zero 1007 call dotprod_g(dedt,doti,istwf_k,npw1*nspinor,1,conjgr,gresid,me_g0,mpi_enreg%comm_spinorfft) 1008 dedt=-two*two*dedt 1009 1010 if((prtvol==-level.or.prtvol==-19.or.prtvol==-20).and.dedt-tol14>0) then 1011 call wrtout(std_out,' DFPT_CG_WARNING : dedt>0') 1012 end if 1013 ABI_MALLOC(gvnlx_direc,(2,npw1*nspinor)) 1014 ABI_MALLOC(gh_direc,(2,npw1*nspinor)) 1015 if (gen_eigenpb) then 1016 ABI_MALLOC(sconjgr,(2,npw1*nspinor)) 1017 else 1018 ABI_MALLOC(sconjgr,(0,0)) 1019 end if 1020 sij_opt=0;if (gen_eigenpb) sij_opt=1 1021 cpopt=-1+usepaw 1022 call getghc(cpopt,conjgr,conjgrprj,gh_direc,sconjgr,gs_hamkq,gvnlx_direc,& 1023 eshift,mpi_enreg,1,prtvol,sij_opt,tim_getghc,0,select_k=KPRIME_H_KPRIME) 1024 1025 ! ghc also includes the eigenvalue shift 1026 if (gen_eigenpb) then 1027 !$OMP PARALLEL DO 1028 do ipw=1,npw1*nspinor 1029 gh_direc(1:2,ipw)=gh_direc(1:2,ipw)-eshift*sconjgr(1:2,ipw) 1030 end do 1031 else 1032 !$OMP PARALLEL DO 1033 do ipw=1,npw1*nspinor 1034 gh_direc(1:2,ipw)=gh_direc(1:2,ipw)-eshift*conjgr(1:2,ipw) 1035 end do 1036 end if 1037 1038 ! compute d2edt2, Eq.(30) of of PRB55, 10337 (1997) [[cite:Gonze1997]], 1039 ! with an additional factor of 2 for the difference 1040 ! between E(2) and the 2DTE, and neglect of local fields (SC terms) 1041 d2edt2 = zero 1042 call dotprod_g(d2edt2,doti,istwf_k,npw1*nspinor,1,conjgr,gh_direc,me_g0,mpi_enreg%comm_spinorfft) 1043 d2edt2=two*two*d2edt2 1044 1045 if(prtvol==-level.or.prtvol==-19)then 1046 write(msg,'(a,2es14.6)') 'dfpt_cgwf: dedt,d2edt2=',dedt,d2edt2 1047 call wrtout(std_out,msg) 1048 end if 1049 1050 ! ====================================================================== 1051 ! ======= COMPUTE MIXING FACTOR - CHECK FOR CONVERGENCE =============== 1052 ! ====================================================================== 1053 1054 1055 ! see Eq.(31) of PRB55, 10337 (1997) [[cite:Gonze1997]] 1056 ! 1057 if(d2edt2<-tol_restart .and. skipme == 0)then 1058 ! This may happen when the eigenvalue eig_mk(0) is higher than 1059 ! the lowest non-treated eig_mk+q(0). The solution adopted here 1060 ! is very crude, and rely upon the fact that occupancies of such 1061 ! levels should be smaller and smaller with increasing nband, so that 1062 ! a convergence study will give the right result. 1063 theta=zero 1064 1065 cwavef=zero 1066 ghc =zero 1067 gvnlxc =zero 1068 if (gen_eigenpb) gsc=zero 1069 if (usepaw==1) then 1070 call pawcprj_set_zero(cwaveprj) 1071 end if 1072 ! A negative residual will be the signal of this problem ... 1073 resid=-two 1074 if (prtvol > 0 .and. u1_band_ > 0) then 1075 call wrtout(std_out,' dfpt_cgwf: problem of minimisation (likely metallic), set resid to -2') 1076 end if 1077 else if (d2edt2 > 1.d-40) then 1078 ! Here, the value of theta that gives the minimum 1079 theta=-dedt/d2edt2 1080 !write(std_out,*)' dfpt_cgwf: dedt,d2edt2=',dedt,d2edt2 1081 else 1082 if (u1_band_ > 0) call wrtout(std_out, "DFPT_CGWF WARNING: d2edt2 is zero, skipping update") 1083 theta=zero 1084 end if 1085 1086 ! Check that result is above machine precision 1087 if (one+theta==one) then 1088 if (prtvol > 0 .and. u1_band_ > 0) then 1089 write(msg, '(a,es16.4)' )' dfpt_cgwf: converged with theta= ',theta 1090 call wrtout(std_out,msg) 1091 end if 1092 nskip=nskip+2*(nline-iline) ! Number of one-way 3D ffts skipped 1093 skipme = 1 1094 end if 1095 1096 ! ====================================================================== 1097 ! ================ GENERATE NEW |wf>, H|wf>, Vnl|Wf ... ================ 1098 ! ====================================================================== 1099 1100 if (skipme == 0) then 1101 call cg_zaxpy(npw1*nspinor, [theta, zero], conjgr,cwavef) 1102 ! Filter the wavefunctions for large modified kinetic energy (see routine mkkin.f) 1103 do ispinor=1,nspinor 1104 ipws=(ispinor-1)*npw1 1105 !$OMP PARALLEL DO PRIVATE(ipw) SHARED(cwavef,kinpw1,ipws,npw1) 1106 do ipw=1+ipws,npw1+ipws 1107 if(kinpw1(ipw-ipws)>huge(zero)*1.d-11)then 1108 cwavef(1:2,ipw)=zero 1109 end if 1110 end do 1111 end do 1112 1113 call cg_zaxpy(npw1*nspinor, [theta, zero], gh_direc,ghc) 1114 call cg_zaxpy(npw1*nspinor, [theta, zero], gvnlx_direc,gvnlxc) 1115 1116 if (gen_eigenpb) then 1117 call cg_zaxpy(npw1*nspinor, [theta, zero], sconjgr, gsc) 1118 end if 1119 if (usepaw==1) then 1120 call pawcprj_axpby(theta,one,conjgrprj,cwaveprj) 1121 end if 1122 end if 1123 1124 ABI_FREE(gh_direc) 1125 ABI_FREE(gvnlx_direc) 1126 ABI_FREE(sconjgr) 1127 1128 ! ====================================================================== 1129 ! =========== CHECK CONVERGENCE AGAINST TRIAL ENERGY =================== 1130 ! ====================================================================== 1131 1132 if(usetolrde/=0) then 1133 ! Check reduction in trial energy deltae, Eq.(28) of PRB55, 10337 (1997) [[cite:Gonze1997]] 1134 deltae=half*d2edt2*theta**2+theta*dedt 1135 1136 if (iline==1) then 1137 deold=deltae 1138 ! The extra factor of two should be removed ! 1139 else if (abs(deltae)<tolrde*two*abs(deold) .and. iline/=nline) then 1140 if(prtvol>=10.or.prtvol==-level.or.prtvol==-19)then 1141 write(msg, '(a,i4,1x,a,1p,e12.4,a,e12.4,a)' ) & 1142 ' dfpt_cgwf: line',iline,' deltae=',deltae,' < tolrde*',deold,' =>skip lines' 1143 call wrtout(std_out,msg) 1144 end if 1145 nskip=nskip+2*(nline-iline) ! Number of one-way 3D ffts skipped 1146 skipme = 1 1147 end if 1148 end if 1149 1150 ! if all bands are skippable, we can exit the iline loop for good. 1151 ! otherwise, all procs are needed for the projbd and other operations, 1152 ! even if the present band will not be updated 1153 bands_skipped_now = 0 1154 bands_skipped_now(u1_band) = skipme 1155 if (u1_band_ < 0) bands_skipped_now(u1_band) = 0 ! Handle negative index 1156 call xmpi_sum(bands_skipped_now,mpi_enreg%comm_band,ierr) 1157 1158 ! bands_skipped_now = bands_skipped_now - bands_treated_now 1159 if (sum(abs(bands_skipped_now - bands_treated_now)) == 0) then 1160 exit 1161 end if 1162 1163 nlines_done = nlines_done + 1 1164 end do ! iline 1165 1166 !-------------------------------------------------------------------------- 1167 ! DEBUG 1168 !-------------------------------------------------------------------------- 1169 ! Check that final cwavef (Psi^(1)) satisfies the orthogonality condition 1170 if (prtvol==-level.or.prtvol==-19) then 1171 sij_opt=0 ; usevnl=1 ; optlocal=1 ; optnl=2 ; if (gen_eigenpb) sij_opt=1 1172 ABI_MALLOC(work1,(2,npw1*nspinor*((sij_opt+1)/2))) 1173 ABI_MALLOC(work2,(2,npw1*nspinor*sij_opt)) 1174 iband_me = 0 1175 do iband=1,nband 1176 if (bands_treated_now(iband) == 0) cycle 1177 if (rank_band(iband) == me_band) then 1178 iband_me = iband_me+1 1179 if (gen_eigenpb) then 1180 work(:,:)=gscq(:,1+npw1*nspinor*(iband-1)+igscq:npw1*nspinor*iband+igscq) 1181 else 1182 work(:,:)=cgq(:,1+npw1*nspinor*(iband-1)+icgq:npw1*nspinor*iband+icgq) 1183 end if 1184 end if 1185 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 1186 1187 ! Compute: <Psi^(0)_i,k+q|Psi^(1)_j,k,q> 1188 call dotprod_g(prod1,prod2,istwf_k,npw1*nspinor,2,work,cwavef,me_g0,mpi_enreg%comm_spinorfft) 1189 1190 if (ipert/=natom+10.and.ipert/=natom+11) then 1191 if (gen_eigenpb) then 1192 call getgh1c(berryopt,cwave0,cwaveprj0,work1,gberry,work2,gs_hamkq,gvnlx1_saved,idir,ipert,eshift,& 1193 mpi_enreg,optlocal,optnl,opt_gvnlx1,rf_hamkq,sij_opt,tim_getgh1c,usevnl) 1194 1195 if (rank_band(iband) == me_band) then 1196 work(:,:)=cgq(:,1+npw1*nspinor*(iband_me-1)+icgq:npw1*nspinor*iband_me+icgq) 1197 end if 1198 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 1199 call dotprod_g(dotr,doti,istwf_k,npw1*nspinor,2,work,work2,me_g0,mpi_enreg%comm_spinorfft) 1200 else 1201 dotr=zero; doti=zero 1202 end if 1203 dotr=prod1+half*dotr 1204 doti=prod2+half*doti 1205 else if(prtvol==-19) then ! 2nd order case 1206 dotr=prod1+half*rf2%amn(1,iband+(u1_band-1)*nband) 1207 doti=prod2+half*rf2%amn(2,iband+(u1_band-1)*nband) 1208 else 1209 write(msg,'(a)') 'CGWF3_WARNING : Use prtvol=-19 to test orthogonality for ipert=natom+10 or +11' 1210 call wrtout(std_out,msg,'COLL') 1211 end if 1212 dotr=sqrt(dotr**2+doti**2) 1213 if(dotr>tol10) then 1214 !if (gen_eigenpb) then 1215 ! write(msg,'(2a,i3,a,2es22.15)') 'CGWF3_WARNING : <Psi^(1)_i,k,q|S^(0)|Psi^(0)_j,k+q>',& 1216 ! '+ 1/2<Psi^(0)_i,k|S^(1)|Psi^(0)_j,k+q>, for j= ',iband,' is ',dotr,doti 1217 !else 1218 write(msg,'(a,i3,a,es22.15)') 'CGWF3_WARNING : |<Psi^(0)_i,k+q|Psi^(1)_j,k,q>+amn(i,j)/2|, for j= ',iband,' is ',dotr 1219 !end if 1220 call wrtout(std_out,msg) 1221 end if 1222 end do 1223 ABI_FREE(work1) 1224 ABI_FREE(work2) 1225 if (ipert/=natom+10.and.ipert/=natom+11) then 1226 ABI_FREE(gvnlx1_saved) 1227 end if 1228 end if ! prtvol==-level.or.prtvol==-19 1229 1230 if (prtvol==-level.or.prtvol==-19)then 1231 ! Check that final cwavef Psi^(1) is Pc.Psi^(1)+delta_Psi^(1) 1232 ABI_MALLOC(cwwork,(2,npw1*nspinor)) 1233 ! -Apply Pc to Psi^(1) 1234 do iband = 1, nband 1235 if (bands_treated_now(iband) == 0) cycle 1236 if (rank_band(iband) == me_band) cwwork=cwavef 1237 1238 call xmpi_bcast(cwwork,rank_band(iband),mpi_enreg%comm_band,ierr) 1239 1240 if(gen_eigenpb)then 1241 call projbd(cgq,cwwork,-1,igscq,icgq,istwf_k,mgscq,mcgq,nband_me,npw1,nspinor,& 1242 cgq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 1243 else 1244 call projbd(cgq,cwwork,-1,icgq,0,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 1245 dummy,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 1246 end if 1247 1248 call xmpi_sum(cwwork,mpi_enreg%comm_band,ierr) 1249 1250 ! save this for me_band only 1251 !TODO: make this a blas call? zaxpy 1252 if (rank_band(iband) == me_band) then 1253 cwwork = cwwork - (mpi_enreg%nproc_band-1)*cwavef 1254 end if 1255 end do 1256 1257 ! -Add delta_Psi^(1) 1258 if (usedcwavef>0) cwwork=cwwork+dcwavef 1259 if(ipert==natom+10.or.ipert==natom+11) cwwork=cwwork+rf2%dcwavef(:,1+dc_shift_band:npw1*nspinor+dc_shift_band) 1260 ! -Compare to Psi^(1) 1261 cwwork=cwwork-cwavef 1262 call sqnorm_g(dotr,istwf_k,npw1*nspinor,cwwork,me_g0,comm_fft) 1263 ABI_FREE(cwwork) 1264 if(sqrt(dotr)>tol10) then 1265 !if (gen_eigenpb) then 1266 ! write(msg,'(a,i3,a,es22.15)') & 1267 ! 'CGWF3_WARNING : |(Pc.Psi^(1)_i,k,q + delta_Psi^(1)_i,k) - Psi^(1)_i,k,q|^2 (band ',u1_band,')=',dotr 1268 !else 1269 write(msg,'(a,es22.15)') 'CGWF3_WARNING : |(Pc.Psi^(1)_i,k,q + delta_Psi^(1)_i,k) - Psi^(1)_i,k,q| = ',sqrt(dotr) 1270 !end if 1271 call wrtout(std_out,msg) 1272 end if 1273 end if ! prtvol==-level.or.prtvol==-19 1274 1275 if(prtvol==-level.or.prtvol==-19.or.prtvol==-20)then 1276 ! Check that final cwavef (Psi^(1)) solves the Sternheimer equation 1277 ABI_MALLOC(cwwork,(2,npw1*nspinor)) 1278 ! -Apply Pc to Psi^(1) 1279 do iband = 1, nband 1280 if (bands_treated_now(iband) == 0) cycle 1281 1282 if (rank_band(iband) == me_band) cwwork=cwavef 1283 call xmpi_bcast(cwwork,rank_band(iband),mpi_enreg%comm_band,ierr) 1284 1285 if(gen_eigenpb)then 1286 call projbd(cgq,cwwork,-1,igscq,icgq,istwf_k,mgscq,mcgq,nband_me,npw1,nspinor,& 1287 gscq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 1288 else 1289 call projbd(cgq,cwwork,-1,icgq,0,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 1290 dummy,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 1291 end if 1292 1293 call xmpi_sum_master(cwwork, rank_band(iband), mpi_enreg%comm_band, ierr) 1294 1295 ! save this for me_band only 1296 !TODO: make this a blas call? zaxpy 1297 if (rank_band(iband) == me_band) then 1298 cwwork = cwwork - (mpi_enreg%nproc_band-1)*cwavef 1299 end if 1300 end do 1301 1302 ! - Apply H^(0)-E.S^(0) 1303 sij_opt=0;if (gen_eigenpb) sij_opt=1 1304 cpopt=-1 1305 ABI_MALLOC(work1,(2,npw1*nspinor*((sij_opt+1)/2))) 1306 ABI_MALLOC(work2,(2,npw1*nspinor)) 1307 call getghc(cpopt,cwwork,conjgrprj,work,work1,gs_hamkq,work2,eshift,& 1308 mpi_enreg,1,prtvol,sij_opt,tim_getghc,0,select_k=KPRIME_H_KPRIME) 1309 if (gen_eigenpb) then 1310 cwwork=work-eshift*work1 1311 else 1312 cwwork=work-eshift*cwwork 1313 end if 1314 ABI_FREE(work1) 1315 ABI_FREE(work2) 1316 1317 ! The following is not mandatory, as Pc has been already applied to Psi^(1) 1318 ! and Pc^* H^(0) Pc = Pc^* H^(0) = H^(0) Pc (same for S^(0)). 1319 ! However, in PAW, to apply Pc^* here seems to reduce the numerical error 1320 ! -Apply Pc^* 1321 do iband = 1, nband 1322 if (bands_treated_now(iband) == 0) cycle 1323 if (rank_band(iband) == me_band) work=cwwork 1324 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 1325 1326 if(gen_eigenpb)then 1327 call projbd(gscq, work,-1,igscq,icgq,istwf_k,mgscq,mcgq,nband_me,npw1,nspinor,& 1328 cgq,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 1329 else 1330 call projbd(cgq,work,-1,icgq,0,istwf_k,mcgq,mgscq,nband_me,npw1,nspinor,& 1331 dummy,scprod,0,tim_projbd,useoverlap,me_g0,comm_fft) 1332 end if 1333 1334 call xmpi_sum(work,mpi_enreg%comm_band,ierr) 1335 1336 ! save this for me_band only 1337 !TODO: make this a blas call? zaxpy 1338 if (rank_band(iband) == me_band) then 1339 cwwork = cwwork - (mpi_enreg%nproc_band-1)*work 1340 end if 1341 end do 1342 1343 ! - Add Pc^*(H^(1)-E.S^(1)).Psi^(0) 1344 cwwork=cwwork+gh1c 1345 call sqnorm_g(dotr,istwf_k,npw1*nspinor,cwwork,me_g0,comm_fft) 1346 ABI_FREE(cwwork) 1347 write(msg,'(a,i3,a,es22.15,2a,i4)') & 1348 '*** CGWF3 Sternheimer equation test for band ',u1_band,'=',sqrt(dotr),ch10,& 1349 'It should go to zero for large nline : nlines_done = ',nlines_done 1350 call wrtout(std_out,msg) 1351 end if ! prtvol==-level.or.prtvol==-19.or.prtvol==-20 1352 1353 if(prtvol==-level.or.prtvol==-19.or.prtvol==-20)then 1354 ! Check that < Psi^(0) | ( H^(0)-eps^(0) S^(0) ) | Psi^(1) > is in agreement with eig^(1) 1355 ABI_MALLOC(cwwork,(2,npw1*nspinor)) 1356 cwwork=cwavef 1357 ! - Apply H^(0)-E.S^(0) 1358 sij_opt=0;if (gen_eigenpb) sij_opt=1 1359 cpopt=-1 1360 ABI_MALLOC(work1,(2,npw1*nspinor*((sij_opt+1)/2))) 1361 ABI_MALLOC(work2,(2,npw1*nspinor)) 1362 call getghc(cpopt,cwwork,conjgrprj,work,work1,gs_hamkq,work2,eshift,& 1363 mpi_enreg,1,prtvol,sij_opt,tim_getghc,0,select_k=KPRIME_H_KPRIME) 1364 if (gen_eigenpb) then 1365 cwwork=work-eshift*work1 1366 else 1367 cwwork=work-eshift*cwwork 1368 end if 1369 ABI_FREE(work1) 1370 ABI_FREE(work2) 1371 cwwork=cwwork+gh1c_n 1372 jband=(u1_band-1)*2*nband 1373 iband_me = 0 1374 do iband=1,nband 1375 if (bands_treated_now(iband) == 0) cycle 1376 if (rank_band(iband) == me_band) then 1377 iband_me = iband_me+1 1378 work(:,:)=cgq(:,1+npw1*nspinor*(iband_me-1)+icgq:npw1*nspinor*iband_me+icgq) 1379 end if 1380 call xmpi_bcast(work,rank_band(iband),mpi_enreg%comm_band,ierr) 1381 1382 call dotprod_g(dotr,doti,istwf_k,npw1*nspinor,2,work,cwwork,me_g0,mpi_enreg%comm_spinorfft) 1383 dotr = dotr - eig1_k(2*iband-1+jband) 1384 doti = doti - eig1_k(2*iband +jband) 1385 dotr = sqrt(dotr**2+doti**2) 1386 if (dotr > tol8) then 1387 write(msg,'(2(a,i3),a,es22.15)') & 1388 'CGWF3_WARNING < Psi^(0) | ( H^(0)-eps^(0) S^(0) ) | Psi^(1) > for i=',iband,' j=',u1_band,& 1389 ' : ',sqrt(dotr**2+doti**2) 1390 call wrtout(std_out,msg) 1391 end if 1392 end do 1393 !write(std_out,'(a)') '< Psi^(0) | ( H^(0)-eps^(0) S^(0) ) | Psi^(1) > is done.' 1394 ABI_FREE(cwwork) 1395 end if ! prtvol==-level.or.prtvol==-19.or.prtvol==-20 1396 !-------------------------------------------------------------------------- 1397 ! END DEBUG 1398 !-------------------------------------------------------------------------- 1399 1400 if (allocated(gh_direc)) then 1401 ABI_FREE(gh_direc) 1402 end if 1403 if (allocated(gvnlx_direc)) then 1404 ABI_FREE(gvnlx_direc) 1405 end if 1406 ABI_FREE(conjgr) 1407 ABI_FREE(cwaveq) 1408 ABI_FREE(direc) 1409 ABI_FREE(gresid) 1410 if (usepaw==1) then 1411 call pawcprj_free(conjgrprj) 1412 end if 1413 ABI_FREE(conjgrprj) 1414 1415 if(u1_band>max(1,nband-nbdbuf))then 1416 ! A small negative residual will be associated with these 1417 ! in the present algorithm all bands need to be in the loops over cgq etc... for the parallelization 1418 resid=-0.1_dp 1419 end if 1420 1421 ! At the end of the treatment of a set of bands, write the number of one-way 3D ffts skipped 1422 if (xmpi_paral==1 .and. u1_band==nband .and. prtvol>=10) then 1423 write(msg,'(a,i0)')' dfpt_cgwf: number of one-way 3D ffts skipped in cgwf3 until now =',nskip 1424 call wrtout(std_out,msg) 1425 end if 1426 1427 ABI_FREE(work) 1428 ABI_FREE(gh1c) 1429 ABI_FREE(pcon) 1430 ABI_FREE(scprod) 1431 ABI_FREE(gberry) 1432 1433 call timab(122,2,tsec) 1434 1435 DBG_EXIT("COLL") 1436 1437 end subroutine dfpt_cgwf
ABINIT/m_dfpt_cgwf [ Modules ]
NAME
m_dfpt_cgwf
FUNCTION
Update one single wavefunction (cwavef), non self-consistently. Uses a conjugate-gradient algorithm.
COPYRIGHT
Copyright (C) 1999-2024 ABINIT group (XG,DRH,XW,FJ,MT,LB) This file is distributed under the terms of the GNU General Public License, see ~abinit/COPYING or http://www.gnu.org/copyleft/gpl.txt .
SOURCE
17 #if defined HAVE_CONFIG_H 18 #include "config.h" 19 #endif 20 21 #include "abi_common.h" 22 23 module m_dfpt_cgwf 24 25 use defs_basis 26 use m_abicore 27 use m_errors 28 use m_xmpi 29 use m_cgtools 30 use m_rf2 31 32 use defs_abitypes, only : MPI_type 33 use m_time, only : timab 34 use m_pawcprj, only : pawcprj_type, pawcprj_alloc, pawcprj_free, pawcprj_set_zero, pawcprj_axpby 35 use m_hamiltonian, only : gs_hamiltonian_type, rf_hamiltonian_type, KPRIME_H_KPRIME 36 use m_getghc, only : getghc 37 use m_getgh1c, only : getgh1c, getdc1 38 39 implicit none 40 41 private