Derivation of the Average Force in the Cannonical and Isothermal-Isobaric Ensemble

In the canonical or NVT ensemble the number of particles, volume, and temperature are fixed. We have microstates r3N and macrostates Rm and a coarse-graining function 𝒻3Nm. Each microstate has an energy defined by E(N,V,T,r)×𝒫(3)××3N. In the canonical ensemble the Hemholtz free energy defined to be:

F(N,V,T,R)U(N,V,T,R)TS(N,V,T,R)

where

U(N,V,T,R)E(N,V,T,r)p(r|R)=1Z(R)VNeE(N,V,T,r)kBTE(r)δ(𝒻(r)R)dr
S(N,V,T,R)kBln(p(r|R,N,V,T)p(r|R)=kBZ(R)VNeE(N,V,T,r)kBTln(1Z(R)eE(N,V,T,r)kBT)δ(𝒻(r)R)dr

When assuming entropy is maximized it can be derived that the probability of being in a microstate is

p(r|R,N,V,T)=1Z(R)eE(N,V,T,r)kBTδ(𝒻(r)R)

And it can be found that

F(N,V,T,R)=kBTln(Z(N,V,T,R))

where the partition function is

Z(N,V,T,R)=VNeE(N,V,T,r)kBTδ(𝒻(r)R)dr

For brevity, because N,V,T are all fixed for a certain ensemble, we omit them from the function arguments.

Taking the gradient of F gives:

RF(R)=kBTRln(Z(R))=kBT1Z(R)RZ(R)

Then substituting Z in for its definition gives

=kBT1Z(R)RVNeE(r)kBTδ(𝒻(r)R)dr

And the only part of the expression that depends on R is the delta function.

=kBT1Z(R)VNeE(r)kBTRδ(𝒻(r)R)dr

For any function bg(ab)=ag(ab) so Rδ(𝒻(r)R)=𝒻(r)δ(𝒻(r)R)

=kBT1Z(R)VNeE(r)kBT𝒻(r)δ(𝒻(r)R)dr

The gradient can be converted from 𝒻(r) to r using a matrix 𝐁(r) that is constrained by the Jacobian of f.

Converting the gradient with the pseudoinverse of the Jacobian

The gradient with respect to 𝒻(x) is

𝒻(r)=[𝒻1(r)𝒻2(r)𝒻m(r)]

And since the expression δ(𝒻(r)=R) only depends on r through 𝒻:

δ(𝒻(r)R)ri=j=0mδ(𝒻(r)R)𝒻j(r)𝒻j(r)ri

meaning

rδ(𝒻(r)=R)=[r1r2r3N]δ(𝒻(r)=R)=[𝒻1(r1)r1𝒻2(r1)r1𝒻m(r1)r1𝒻1(r1)r2𝒻1(r3N)r3N𝒻mr3N][𝒻1(r)𝒻2(r)𝒻m(r)]δ(𝒻(r)=R)

and is equivelent to

=𝐉𝒻𝒻(r)δ(𝒻(r)=R)

where 𝐉𝒻(r)in3Nm×3N is the Jacobian of 𝒻 at r.

Starting with

rδ(𝒻(r)=R)=𝐉𝒻(r)𝒻(r)δ(𝒻(r)=R)

For any matrix 𝐁(r)m×3N such that

𝐁(r)𝐉𝒻(r)=𝐈

Will satisfy

𝐁(r)rδ(𝒻(r)=R)=𝒻(r)δ(𝒻(r)=R)

Showing the pseudoinverse is a possible solution if m3N

The transposed pseudinverse of the Jacobian, 𝐉𝒻+(r)=(𝐉𝒻(r)𝐉𝒻(r))1𝐉𝒻(r) is one possible solution for 𝐁(r), but is not necessarily the only solution.

If we multiply both sides by 𝐉𝒻(rf)

𝐉𝒻(r)rδ(𝒻(r)=R)=𝐉𝒻(r)𝐉𝒻(r)𝒻(r)δ(𝒻(r)=R)

Now if we assume that 𝐉𝒻(r)𝐉𝒻(r) is invertable, which it is if the rows of 𝐉𝒻(r) are linearely independent (this assumption implies m3N must be true placing additional constraints on the choice of coarse graining function), then we can multiply both sides by its inverse.

𝒻(r)δ(𝒻(r)=R)=(𝐉𝒻(r𝐉𝒻(r)1𝐉𝒻(rrδ(𝒻(r)=R)

The expression on the right is simply the formula for the pseudoinverse of a matrix, (if it has more columns than rows which is true for 𝐉𝒻(r) because m3N).

A+=(AA)1A

so substituting A=𝐉𝒻(r)

(𝐉𝒻(r))+=(𝐉𝒻(r)𝐉𝒻(r))1𝐉𝒻(r)

And the pseudoinverse and transpose are commutative so we will just write (𝐉𝒻(r))+ as 𝐉𝒻+(r)

𝒻(r)δ(𝒻(r)R)=𝐉𝒻+(r)rδ(𝒻(r)R)

So after replacing the gradient we get the expression

RF(R)=kBT1Z(R)eE(r)kBT𝐁(r)rδ(𝒻(r)R)dr

Next, using the divergence theorem where the total divergence of a volume Ω is equal to the surface integral Γ. If A is a scalar function and is a vector function then

ΓA(r)(r)ndr=Ωr(A(r)(r))dr

Where n is the normal vector to the surface.

The product rule for divergence is

r(A(r)(r))=(rA(r))(r)+A(r)(r(r))

And when applied to the divergence theorem gives

ΓA(r)(r)ndr=Ω(rA(r))(r)dr+ΩA(r)(r(r)dr)

which is a generalization to integration by parts to vector fields

𝐉f+ has dimension m×3N so we can treat each row as a vector. In the following notation we apply dot products and divergences to each row of 𝐉f+, so x𝐉f+=𝐉f+x means dot producting each row which is equivelent to matrix muliplication. We substitute A(r)=δ(𝒻(r)R) and (r)=eE(r)kBT𝐁(r) into the divergence theorem.

Γδ(𝒻(r)R)eE(r)kBT𝐁(r)ndr
=Ω(rδ(𝒻(r)R))eE(r)kBT𝐁(r)dr+Ωδ(𝒻(r)R)(reE(r)kBT𝐁(r))dr

When Ω=V the boundary points xΓ will contain points on the edge of the volume. Any points not inside the volume should have an infinite energy so that eE(x)kBT should be zero (not sure about this part about eliminating the boundary term, could be wrong). The the entire left hand side of the equality should be zero.

0=Ω(rδ(𝒻(r)R))eE(r)kBT𝐁(r)dr+Ωδ(𝒻(r)R)(reE(r)kBT𝐁(r))dr

so then

Ω(rδ(𝒻(r)R))eE(r)kBT𝐁(r)dr=Ωδ(𝒻(r)R)(reE(r)kBT𝐁(r))dr

and dot product-ing each row of a matrix is just matrix multiplication so

ΩeE(r)kBT𝐁(r)(rδ(𝒻(r)R))dr=Ωδ(𝒻(r)R)(reE(r)kBT𝐁(r))dr

Then substitue into the original equation gives

RF(R)=kBT1Z(R)Ωδ(𝒻(r)R)(reE(r)kBT𝐁(r))dr

Using the divergence product rule again

=kBT1Z(R)VN(eE(r)kBT𝐁(r)rE(r)kBT+eE(r)kBTr𝐁(r))δ(𝒻(r)R)dr
=kBT1Z(R)VNeE(r)kBTδ(𝒻(r)R)(𝐁(r)rE(r)kBT+r𝐁(r))dr
=1Z(R)VNeE(r)kBTδ(𝒻(r)R)(𝐁(r)rE(r)kBTr𝐁(r))dr
=𝐁(r)rE(r)kBTkBTr𝐁(r)p(r|R)

Giving the final equation

RF(R)=𝐁(r)rE(r)p(r|R)kBTr𝐁(r)p(r|R)

Seperating energy and entropic terms

Energy term

We want to seperate the Helmholtz free energy into energy and entropic terms.

F(N,V,T,R)=U(N,V,T,R)TS(N,V,T,R)

If we leave out the fixed terms N,V,T we can write this as

F(R)=U(R)TS(R)
RF(R)=RU(R)TRS(R)

We start with the definition of U(R)

U(R)=1Z(R)VNeE(r)kBTE(r)δ(𝒻(r)R)dr

Take the gradient

RU(R)=R(1Z(R)VNeE(r)kBTE(r)δ(𝒻(r)R)dr)

Then apply the product rule

=R(1Z(R))VNeE(r)kBTE(r)δ(𝒻(r)R)dr+1Z(R)RVNeE(r)kBTE(r)δ(𝒻(r)R)dr

The product rule creates 2 terms we must calculate. Starting with the first term.

Gradient of the first term

R(1Z(R))VNeE(r)kBTE(r)δ(𝒻(r)R)dr

Applying the chain rule

=(RZ(R)Z(R)2)VNeE(r)kBTE(r)δ(𝒻(r)R)dr
=RZ(R)Z(R)1Z(R)VNeE(r)kBTE(r)δ(𝒻(r)R)dr
p

And the integral is just the internal energy by definition

=RZ(R)Z(R)U(R)

Now we just need to find RZ(R)Z(R). Expanding its definition:

1Z(R)RZ(R)=RVNeE(r)kBTδ(𝒻(r)R)dr
=1Z(R)VNeE(r)kBTRδ(𝒻(r)R)dr
=1Z(R)VNeE(r)kBT𝒻(r)δ(𝒻(r)R)dr

Converting the gradients again using the Jacobian:

=1Z(R)VNeE(r)kBT𝐁(r)rδ(𝒻(r)R)dr

And applying the same reasoning as the previous section using the divergence theorem and the divergence product rule:

=1Z(R)VN(reE(r)kBT𝐁(r))δ(𝒻(r)R)dr

Then the divergence product rule again

=1Z(R)VN(((reE(r)kBT)𝐁(r))+(eE(r)kBTr𝐁(r)))δ(𝒻(r)R)dr

Then the chain rule

=1Z(R)VN((eE(r)kBTrE(r)kBT𝐁(r))+(eE(r)kBTr𝐁(r)))δ(𝒻(r)R)dr
=1Z(R)VNeE(r)kBT((rE(r)kBT𝐁(r))+r𝐁(r))δ(𝒻(r)R)dr
=(rE(r)kBT𝐁(r))+r𝐁(r)p(r|R)
=rE(r)kBT𝐁(r)p(r|R)r𝐁(r)p(r|R)

This equation gives an expression for RZ(R). Substituting gives

RZ(R)Z(R)U(R)=U(R)rE(r)kBT𝐁(r)p(r|R)U(R)r𝐁(r)p(r|R)

This equation gives an expression for the first term.

Next we have to find the gradient of the second term.

Gradient of the second term

Starting with the term

1Z(R)RVNeE(r)kBTE(r)δ(𝒻(r)R)dr

we can move the gradient to the only part that depends on R

1Z(R)VNeE(r)kBTE(r)Rδ(𝒻(r)R)dr
1Z(R)VNeE(r)kBTE(r)𝒻(r)δ(𝒻(r)R)dr

Once again convert the gradient

1Z(R)VNeE(r)kBTE(r)𝐁(r)rδ(𝒻(r)R)dr

Once again use the divergence theorem and the divergence product rule.

1Z(R)VN(r(eE(r)kBTE(r)𝐁(r)))δ(𝒻(r)R)dr

Then the divergence product rule again

1Z(R)VN((r(eE(r)kBTE(r))𝐁(r))+((eE(r)kBTE(r))r𝐁(r)))δ(𝒻(r)R)dr

Use the product rule

1Z(R)VN((((reE(r)kBT)E(r)+eE(r)kBTrE(r))𝐁(r))+((eE(r)kBTE(r))r𝐁(r)))δ(𝒻(r)R)dr

Then the chain rule

1Z(R)VN((((eE(r)kBTrE(r)kBT)E(r)+eE(r)kBTrE(r))𝐁(r))+((eE(r)kBTE(r))r𝐁(r)))δ(𝒻(r)R)dr

Factor out the exponential

1Z(R)VNeE(r)kBT((rE(r)kBTE(r)𝐁(r))+((rE(r))𝐁(r))+(E(r)r𝐁(r)))δ(𝒻(r)R)dr
rE(r)kBTE(r)𝐁(r)p(r|R)+(rE(r))𝐁(r)p(r|R)+E(r)r𝐁(r)p(r|R)

This equation gives an expression for the second term.

Now we just add the expressions for the 2 terms together

RU(R)=
U(R)rE(r)kBT𝐁(r)p(r|R)U(R)r𝐁(r)p(r|R)+
rE(r)kBTE(r)𝐁(r)p(r|R)+(rE(r))𝐁(r)p(r|R)+E(r)r𝐁(r)p(r|R)

We can group some terms

=E(r)U(R)kBTrE(r)𝐁(r)p(r|R)+(E(r)U(R))r𝐁(r)p(r|R)+(rE(r))𝐁(r)p(r|R)

So we have a final expression of

RU(R)=E(r)U(R)kBT𝐁(r)rE(r)p(r|R)+(E(r)U(R))r𝐁(r)p(r|R)+𝐁(r)rE(r)p(r|R)

Entropic term

Starting with the definition of entropy

TS(R)=kBT1Z(R)VNeE(r)kBTln(1Z(R)eE(r)kBT)δ(𝒻(r)R)dr

Using basic algebraic manupulation

=kBTZ(R)VNeE(r)kBT(ln(1Z(R))+ln(eE(r)kBT))δ(𝒻(r)R)dr
=kBTZ(R)VNeE(r)kBT(ln(Z(R))E(r)kBT)δ(𝒻(r)R)dr
=1Z(R)VNeE(r)kBT(kBTln(Z(R))+E(r))δ(𝒻(r)R)dr
=(kBTln(Z(R)))1Z(R)VNeE(r)kBTδ(𝒻(r)+R)dr1Z(R)VNeE(r)kBTE(r)δ(𝒻(r)R)dr
=(kBTln(Z(R)))1Z(R)Z(R)+U(R)
=(kBTln(Z(R)))+U(R)
=F(R)+U(R)

So this is consistent that

TS(R)=U(R)F(R)

Then

TRS(R)=RU(R)RF(R)

and we can substitute

=E(r)U(R)kBT𝐁(r)rE(r)p(r|R)+(E(r)U(R))r𝐁(r)p(r|R)+𝐁(r)rE(r)p(r|R)
(𝐁(r)rE(r)p(r|R)kBTr𝐁(r)p(r|R))

and one of the terms cancels out with a final expression of

TRS(R)=E(r)U(R)kBT𝐁(r)rE(r)p(r|R)+(E(r)U(R))r𝐁(r)p(r|R)+kBTr𝐁(r)p(r|R)

Isothermal-Isobaric Ensemble (not sure about this part)

We must first derive some properties about the Isothermal-Isobaric ensemble, or NPT ensemble. Starting with the NVT partition function, where V3 is the set of all points in the volume

Zsys(N,V,T,R)=VNeE(rkBTδ(𝒻(r)R)dr

And if the heat bath is considered to be an ideal gas then

Zbath(Nbath,Vbath,T)=VbathNbathds=|Vbath|Nbath

which is equivelent to the total volume to the power of the number of particles

|Vbath|Nbath
=|VtotalVsys|NtotalNsys

If we assume the particles and the heat bath don't share volume then

=(|Vtotal||Vsys|)NtotalNsys
=|Vtotal|Nbath(1|Vsys||Vtotal|)Nbath

Using the natural gas law

P|Vbath|=NbathkBT

We can find a formula for the number of particles in the bath

Nbath=P|Vbath|kBT

and substituting

=|Vtotal|Nbath(1|Vsys||Vtotal|)P|Vbath|kBT

In the limit that |Vbath| and approximating |Vbath||Vtotal|

lim|Vbath|(1|Vsys||Vbath|)P|Vbath|kBT=eP|Vsys|kBT

So we get an expression for the partition function of the heat bath:

Zbath(N,P,T,V)=|Vtotal|NtotaleP|V|kBT

The combined partition function is

Zsys+bath(N,P,T,V,R)=Zbath(N,P,T,V)Zsys(N,V,T,R)

Consider the set V(l)3N of possible points is parameterized by some variables lk, for example the lengths of a box. Also consider a coarse graining function on the parameters of the volume kq where the coarse grained parameters are Lq

Compared to the canonical ensemble the partition function is Δ(N,P,T,R,L).

Δ(N,P,T,R,L)=kZsys+bath(N,P,T,V(l),R)δ((l)L)dl
=kZbath(N,P,T,V(l))Zsys(N,V(l),R,R)δ((l)L)dl
=keP|V(l)|kBTZ(N,V(l),T,R)δ((l)L)dl
=keP|V(L)|VNeE(r,V(L))kBTδ(𝒻(r,L)=R))drdL
=kVNeE(r,V(L))+P|V(L)|kBTδ(𝒻(r,L)=R))drdL

And it can be shown that the Gibbs free energy

H(N,P,T,R,L)=U(N,P,T,R,L)TS(N,P,T,R,L)+PV(l)p(l|L,N,P,T)

And for brevity will be written as

H(R,L)=U(R,L)TS(R,L)+PV(L)p(l|L)

and where the microstates are (r,l)3N×k and there are 2 mapping function R=𝒻(r),L=(l). The Gibbs free energy can be found to be:

H(R,L)=kBTln(Δ(R,L))

Then to calculate the gradient

RH(R,L)=kBTRln(Δ(R,L))

Chain rule

=kBT1Δ(R,L)RΔ(R,L)

Put in the definition

=kBT1Δ(R,L)RkeP|V(l)|kBTZ(V(l),R)δ((l)L)dl

Move the graident inside

=kBT1Δ(R,L)keP|V(l)|kBTRZ(V(l),R)δ((l)L)dl

We can make RZ(V,R) a function of RF(V,R)

F(V,R)=kBTln(Z(V,R))

take the gradient of boths sides

RF(V,R)=kBTRln(Z(V,R))

chain rule

RF(V,R)=kBTRZ(V,R)Z(V,R)

then algebraic manipulation

RZ(V,R)=Z(V,R)kBTRF(V,R)

substituting for RZ(V,R) gives

=kBT1Δ(R,L)keP|V(l)|kBTZ(V,R)kBTRF(V(l),R)δ((l)L)dl

cancel out the denominator

=1Δ(R,L)keP|V(l)|kBTZ(V,R)RF(V(l),R)δ((l)L)dl

and this is simply the weighted average based off the parameters of the volume, giving a final expression of

RH(L,R)=RF(V(l),R)p(l|L)

Move gradient


We can also calculate the macrostate "pressure"

LH(R,L)=kBTRln(Δ(R,L))

Chain rule

=kBT1Δ(R,L)LΔ(R,L)

Put in the definition

=kBT1Δ(R,L)LkeP|V(l)|kBTZ(V(l),R)δ((l)L)dl
=kBT1Δ(R,L)keP|V(l)|kBTZ(V(l),R)Lδ((l)L)dl
=kBT1Δ(R,L)keP|V(l)|kBTZ(V(l),R)(l)δ((l)L)dl
=kBT1Δ(R,L)keP|V(l)|kBTZ(V(l),R)𝐂(l)lδ((l)L)dl

and using the divergence theorem again with the argument the surface integral is zero because the pressure of the bath as the volume of the system becomes infinitely big becomes infinitely large as the system takes up all the volume

=kBT1Δ(R,L)k(leP|V(l)|kBTZ(V(l),R)𝐂(l))δ((l)L)dl
=kBT1Δ(R,L)k((l(eP|V(l)|kBTZ(V(l),R))𝐂(l))+((eP|V(l)|kBTZ(V(l),R))l𝐂(l)))δ((l)L)dl
=kBT1Δ(R,L)k(l(eP|V(l)|kBTZ(V(l),R))𝐂(l))δ((l)L)dlkBTl𝐂(l)p(l|L)
=kBT1Δ(R,L)k(((leP|V(l)|kBT)Z(V(l),R))+(eP|V(l)|kBTlZ(V(l),R)))𝐂(l))δ((l)L)dlkBTl𝐂(l)p(l|L)

Using the same substitution for LZ(V,R) gives

=kBT1Δ(R,L)k(((leP|V(l)|kBT)Z(V(l),R))+(eP|V(l)|kBT(Z(V,R)kBTRF(V,R))))𝐂(l))δ((l)L)dlkBTl𝐂(l)p(l|L)
=kBT1Δ(R,L)k(leP|V(l)|kBT)Z(V(l),R)𝐂(l)δ((l)L)dl+LF(V,R)𝐂(r)p(l|L)kBTl𝐂(l)p(l|L)

chain rule

=kBT1Δ(R,L)k(eP|V(l)|kBTlP|V(l)|kBT)Z(V(l),R)𝐂(l)δ((l)L)dl+LF(V,R)𝐂(r)p(l|L)kBTl𝐂(l)p(l|L)
p
=Pl|V(l)|𝐂(r)p(l|L)+LF(V,R)𝐂(r)p(l|L)kBTl𝐂(l)p(l|L)

Giving a final expression of

LH(R,L)=𝐂(r)Pl|V(l)|p(l|L)+𝐂(r)LF(V,R)p(l|L)kBTl𝐂(l)p(l|L)

Seperating energy, entropic, and pressure contribution

TODO