Wednesday, May 13, 2015

Long time without an update, but here's a quick one- sometimes the old tools are still the best

I was trying to get a list of users currently logged in to machines on a remote network tonight, when it hit me that I could do it pretty easily with PowerShell. A few quick Google searches told me that yes, in fact, I could- and turned up some hideously long, convoluted scripts along with the answer.

Now don't get me wrong- I'm all for error checking out the yang, but do I really need 400 lines when one will do? And do I really need a specific module for getting my answer?

Turns out, no. I just needed the tools built into Windows: specifically the AD extensions for PowerShell, and qwinsta. What's that, you've never heard of qwinsta? That's OK, I only learned of it back in the days of NT4, along with its sibling rwinsta. Why such odd names? Well, in those days we were pretty limited (remember, 8.3 filenames for most everything Microsoft related), so qwinsta (Query WINdow STAtion) and rwinsta (Reset WINdow STAtion) do just what their names say.

Now on to the meat-

Quick check of the network as follows:

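# $s must already be a remoting session to a box that has the AD module, e.g. $s = New-PSSession -ComputerName <dc>
Import-Module ActiveDirectory -PSSession $s
$complist = Get-ADComputer -Filter *
# qwinsta takes its target as /server:<name>
foreach ($comp in $complist) { echo $comp.Name; qwinsta /server:$($comp.Name) }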

And there ya go, every computer AD knows about gets scanned.
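
If you want just the user names rather than eyeballing the raw output, here's a rough sketch in Python (my pick for these examples, not something the one-liner needs). It assumes the usual qwinsta column layout with a USERNAME header. The sample rows are built by hand to mirror that layout, not captured from a real machine, and `logged_in_users` and `row` are made-up helper names.

```python
def logged_in_users(qwinsta_output: str) -> list:
    """Return the user names found in one machine's qwinsta output."""
    lines = [line for line in qwinsta_output.splitlines() if line.strip()]
    if not lines or "USERNAME" not in lines[0]:
        return []
    # User names are left-aligned under the USERNAME header.
    user_col = lines[0].index("USERNAME")
    users = []
    for line in lines[1:]:
        # A session with nobody logged in leaves that column blank.
        if len(line) > user_col and line[user_col] != " ":
            users.append(line[user_col:].split()[0])
    return users


def row(session, user, sid, state):
    """Build one fixed-width line for the hand-made sample below."""
    return f"{session:<19}{user:<22}{sid:>5}  {state}"


sample = "\n".join([
    row(" SESSIONNAME", "USERNAME", "ID", "STATE"),
    row(" services", "", "0", "Disc"),
    row(">console", "alice", "1", "Active"),
    row(" rdp-tcp#3", "bob", "2", "Active"),
    row(" rdp-tcp", "", "65536", "Listen"),
])

print(logged_in_users(sample))  # → ['alice', 'bob']
```

Feed it whatever qwinsta printed for each computer and you get a plain list per machine to do with as you please.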

Wednesday, December 3, 2014

VNXe- proof of how to do storage configuration wrong... very wrong.

~This article was originally to be published in February of 2014~


For the last few weeks, I've been dealing with an environment where storage has been massively misconfigured. Dual VNXe 3100s, only one in use. Storage network running on a single, incredibly slow HP 1410 switch, single NIC, single path... pretty much everything you can do wrong and still have it work.

So I've been trying to fix this in order to get backups to run in a reasonable amount of time- adding dedicated switching, trying to get the second VNXe available for access to the local VMware cluster, etc. The first thing I run into is how incredibly slow this system is- anything I do literally takes 160 seconds to complete. How do I know? I sat there with a stopwatch and clocked several of my actions, because I couldn't believe how slow it seemed- I must have been spoiled! I found to my horror that it's literally 2 minutes, 40 seconds per action. Dear god I hate this thing.

Going over the VM hosts, I come to find that the datastore mappings aren't consistent- each host has most datastores in common, but several were only on one host, or two. So I spend some time correcting this, when I come across another horrible discovery- there were duplicate entries for hosts, and some were flat out wrong (or in one case, completely identical!).

This is where I made the critical mistake- I had gotten the switching in place, I had gotten the SANs moved over, but I hadn't cleaned up the access lists. I started on that, with the rational thought that it would work in a sensible manner. In my mind, I would simply change access over from IP allow lists to IQNs (simpler to manage, right?)- oh so wrong.

This thing had been set up to talk to vCenter, and so talk it did. For some ungodly reason, it decided that since vCenter knew about the datastore associated with the LUN, it could not change the access method (IQN vs IP)- and threw up this beautiful error:

"The changes could not be applied the following error was encountered:
 The datastore name is already in use on the ESX server
error code: 0x600d50"

WTF? Why on earth would this matter? Proper MPIO would allow multiple connections to the same LUN from the same initiator without error, so why on earth would this matter in the least? So, being the trusting, happy go-lucky admin that I am, I click OK.

And developed an ulcer. That instant.

Two of the three hosts were kicked off the VNXe immediately.

OK, I think to myself, easy fix. I'll just go back in and give myself permission again, undoing my changes.

Oh no. No no no- it's not that easy. It's nowhere near that easy. Nothing I do is letting me reattach these LUNs. _Nothing_. Now I'm getting spooked- is the data even still there? Did I somehow just obliterate the customer's data? I dig through emails and documents, finding the credentials I need to get on EMC's support site, where I run into the first hurdle.

Error code 0x600d50 is apparently not something customers need to know about, so if you get that error, you're pretty well screwed. It also doesn't help that every reference to it is in regards to renaming datastores on the VMware side without doing it on the storage side, which apparently does bad things. But this doesn't concern me, right? I made no such change!

There's apparently a lot more to this error than one condition- but it's so piss poorly documented that one will never find out. Now I'm really panicked, so I click on the chat with support button. I fill in all the details, and even manage to find the serial number for the device I'm having a headache with- and then click "submit".

And promptly get told support's not available via web chat.

ok...

Calling in to EMC's support line, I get told immediately by a very friendly recording that I'll get quicker support... if I use the chat client. Yep, this was going to be one of those calls. After navigating the menu system, I get to a young man who is completely lost by the gibberish coming out of my mouth- but he does get me in touch with a woman who understands me perfectly (I wish I could say the same about what she was saying)- I managed to secure a call back promise from her.

I wish I could say the nightmare ends here. The customer is down, and I've notified them that I'm working on it. They're mostly ok with it, as I'm working on it. At this point I'm pretty freaked, as I'm waiting on a call back that I'm not even sure is coming. 30 minutes later, the call back finally happens. And I walk the tech through what's going on, and he immediately starts trying to do all the things I had done.

Which is about when I really start to shiver- the tech was actually expecting it to work too. He gets the senior tech involved, whom I overhear in the background saying that I need to remove all the datastores from the VM hosts.

Can't do it- VMware refuses to unmount a datastore as long as there's a VM on it. So after arguing with support and realizing I'm not going to get anywhere otherwise, I power off every VM and rescan the HBAs...

Which changes nothing.
At all.

And now I'm freaked because the support techs want me to remove the VMs from inventory, now that I can't see the datastores in order to record which VMs go where (why is this a problem, you ask? Because I just took over the environment!).

I flat out refuse- and we go onto the next step. Which involves resetting the SAN. Needless to say, I refused to do that too.

What did eventually work, you ask? Going into the VNXe where the problem originally existed and removing every host entry, then letting the SAN sit and fiddle with itself for a while as I rescanned the HBAs on the hosts. This at least cleared out the datastore lists- not a reassuring thing in the least, mind you. Then I re-added the hosts one by one until I had them back in using the access methods I wanted them to use.

And I attached the first LUN.

And waited forever, or so it seemed. This was when I decided to use a stopwatch to figure out how long this was taking. Each LUN I reattach is taking ~3 minutes. 44 LUNs.

44 LUNs at 3 minutes per. 132 minutes to reattach these datastores. 132 minutes before I can even attempt to get the customer back online. 132 minutes of mind numbing, nail biting, customer frustrating hell.

So, I ask... Is there some way I can do these in bulk? "Nope"

I finally managed to thank the tech for his time and get off the phone. Whereupon I've been stewing for over 2 hours, thinking about how utterly stupid this is. Fuming that I never had issues like this with EqualLogic, HUS, or even open source Linux iSCSI targets.

Why had I never run into these problems? Because none of those care one iota about what's accessing the volume. None of them try to do any screwed up LUN per datastore mapping, nor try to enforce single host access or otherwise- why? Because they assume the storage admin knows what he's doing.

Thursday, March 6, 2014

The horrid beast known as "vCloud"

So once again, I've been tasked with doing something that normally can be done just by going into the viclient or the webclient- namely, modifying disks for a VM. Only... this is in vCloud. The horrid nightmare that it is.

So we start off by logging in and getting auth. To do this, I've been using curl. I'm sure there's another way, possibly a way to do this using IIS/apache, what have you, but it's far outside the scope of what I'm doing at the moment.

curl -i -k -H "Accept:application/*+xml;version=1.5" -u <user>@system:<password> -X POST https://<vcloudHost>/api/sessions

This will return a string you will need for the rest of your transactions:

HTTP/1.1 200 OK
Date: Thu, 06 Mar 2014 16:32:03 GMT
x-vcloud-authorization: HeJTbAEggL97iqdVsqMiINFhu4ZIRsYlPRd96dipjvc=
Set-Cookie: vcloud-token=HeJTbAEggL97iqdVsqMiINFhu4ZIRsYlPRd96dipjvc=; Secure; Path=/
Content-Type: application/vnd.vmware.vcloud.session+xml;version=1.5
Date: Thu, 06 Mar 2014 16:32:03 GMT
Content-Length: 980

<?xml version="1.0" encoding="UTF-8"?>
<Session xmlns="http://www.vmware.com/vcloud/v1.5" user="<user>" org="System" type="application/vnd.vmware.vcloud.session+xml" href="https://<vcloudHost>/api/session/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.vmware.com/vcloud/v1.5 http://<vcloudHost>/api/v1.5/schema/master.xsd">
    <Link rel="down" type="application/vnd.vmware.vcloud.orgList+xml" href="https://<vcloudHost>/api/org/"/>
    <Link rel="down" type="application/vnd.vmware.admin.vcloud+xml" href="https://<vcloudHost>/api/admin/"/>
    <Link rel="down" type="application/vnd.vmware.admin.vmwExtension+xml" href="https://<vcloudHost>/api/admin/extension"/>
    <Link rel="down" type="application/vnd.vmware.vcloud.query.queryList+xml" href="https://<vcloudHost>/api/query"/>
    <Link rel="entityResolver" type="application/vnd.vmware.vcloud.entity+xml" href="https://<vcloudHost>/api/entity/"/>
</Session>

Notice the line that starts off with "x-vcloud-authorization"? You need the string following it.
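
If you'd rather not copy and paste the token by hand, a minimal sketch like this can pull it out of the saved `curl -i` output. It's Python, and `extract_vcloud_token` is just a name I made up. It assumes the headers come first and end at the first blank line, the way the response above does.

```python
def extract_vcloud_token(raw_response: str) -> str:
    """Return the x-vcloud-authorization value from a `curl -i` response."""
    # Headers end at the first blank line; the XML body follows it.
    header_text = raw_response.replace("\r\n", "\n").split("\n\n", 1)[0]
    for line in header_text.split("\n"):
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "x-vcloud-authorization":
            return value.strip()
    raise ValueError("no x-vcloud-authorization header in response")


# Trimmed copy of the response shown above.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Thu, 06 Mar 2014 16:32:03 GMT\r\n"
    "x-vcloud-authorization: HeJTbAEggL97iqdVsqMiINFhu4ZIRsYlPRd96dipjvc=\r\n"
    "Content-Length: 980\r\n"
    "\r\n"
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
)

print(extract_vcloud_token(sample))
# → HeJTbAEggL97iqdVsqMiINFhu4ZIRsYlPRd96dipjvc=
```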

Now to get a list of VMs:

curl -i -k -H "Accept:application/*+xml;version=1.5" -H "x-vcloud-authorization: HeJTbAEggL97iqdVsqMiINFhu4ZIRsYlPRd96dipjvc=" -X GET 'https://<vcloudHost>/api/query?type=adminVM&fields=name,datastoreName'

This will return a list of machines:

HTTP/1.1 200 OK
Date: Thu, 06 Mar 2014 16:32:26 GMT
Content-Type: application/*+xml;version=1.5
Date: Thu, 06 Mar 2014 16:32:26 GMT
Content-Length: 1911

<?xml version="1.0" encoding="UTF-8"?>
<QueryResultRecords xmlns="http://www.vmware.com/vcloud/v1.5" total="7" pageSize="25" page="1" name="adminVM" type="application/vnd.vmware.vcloud.query.records+xml" href="https://<vcloudHost>/api/query?type=adminVM&amp;page=1&amp;pageSize=25&amp;format=records&amp;fields=name,datastoreName" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.vmware.com/vcloud/v1.5 http://<vcloudHost>/api/v1.5/schema/master.xsd">
    <Link rel="alternate" type="application/vnd.vmware.vcloud.query.references+xml" href="https://<vcloudHost>/api/query?type=adminVM&amp;page=1&amp;pageSize=25&amp;format=references&amp;fields=name,datastoreName"/>
    <Link rel="alternate" type="application/vnd.vmware.vcloud.query.idrecords+xml" href="https://<vcloudHost>/api/query?type=adminVM&amp;page=1&amp;pageSize=25&amp;format=idrecords&amp;fields=name,datastoreName"/>
    <AdminVMRecord name="ubieFS3" datastoreName="HUS-3" href="https://<vcloudHost>/api/vApp/vm-0ccbd815-101c-4f3f-bc53-a482dd977e57"/>
    <AdminVMRecord name="ubieTS1" datastoreName="HUS-3" href="https://<vcloudHost>/api/vApp/vm-1dac5547-1764-4fff-a2f9-feca10629d3b"/>
    <AdminVMRecord name="ubieAPP1" datastoreName="HUS-3" href="https://<vcloudHost>/api/vApp/vm-2f1de9cf-1d77-4c6c-b454-58fdce96ceed"/>
    <AdminVMRecord name="ubieSQL1" datastoreName="HUS-3" href="https://<vcloudHost>/api/vApp/vm-58f1c7a3-7bd2-45e5-80c8-fb84674aabe4"/>
    <AdminVMRecord name="ubieDC11" datastoreName="HUS-3" href="https://<vcloudHost>/api/vApp/vm-83cdd93a-ee48-4846-a27c-1919ade3bf9c"/>
    <AdminVMRecord name="ubieEMAIL1" datastoreName="HUS-3" href="https://<vcloudHost>/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4"/>
    <AdminVMRecord name="ubieDC12" datastoreName="HUS-3" href="https://<vcloudHost>/api/vApp/vm-dc5f8f09-072e-4717-94ed-d459ec566992"/>
</QueryResultRecords>
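
Squinting at that XML for the one href you need gets old fast. Here's a small sketch, assuming you saved only the XML body (no `-i` headers), that turns the records into a name-to-href lookup. `vm_hrefs` is a made-up name, and `vcloud.example.com` stands in for the real host, since the `<vcloudHost>` placeholder isn't legal inside an XML attribute.

```python
import xml.etree.ElementTree as ET

VCLOUD = "{http://www.vmware.com/vcloud/v1.5}"


def vm_hrefs(query_xml: str) -> dict:
    """Map each AdminVMRecord name to its href."""
    root = ET.fromstring(query_xml.encode("utf-8"))
    return {rec.get("name"): rec.get("href")
            for rec in root.iter(VCLOUD + "AdminVMRecord")}


# Two of the records from the listing above, with a stand-in host name.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<QueryResultRecords xmlns="http://www.vmware.com/vcloud/v1.5" total="2" name="adminVM">
    <AdminVMRecord name="ubieFS3" datastoreName="HUS-3" href="https://vcloud.example.com/api/vApp/vm-0ccbd815-101c-4f3f-bc53-a482dd977e57"/>
    <AdminVMRecord name="ubieEMAIL1" datastoreName="HUS-3" href="https://vcloud.example.com/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4"/>
</QueryResultRecords>"""

hrefs = vm_hrefs(sample)
print(hrefs["ubieEMAIL1"])
# → https://vcloud.example.com/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4
```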

Now, we want the specifics on one VM:

curl -i -k -H "Accept:application/*+xml;version=1.5" -H "x-vcloud-authorization: HeJTbAEggL97iqdVsqMiINFhu4ZIRsYlPRd96dipjvc=" -X GET 'https://<vcloudHost>/api/query?type=adminVM&filter=(name==ubieEMAIL1)'

HTTP/1.1 200 OK
Date: Thu, 06 Mar 2014 16:47:03 GMT
Content-Type: application/*+xml;version=1.5
Date: Thu, 06 Mar 2014 16:47:03 GMT
Content-Length: 1833

<?xml version="1.0" encoding="UTF-8"?>
<QueryResultRecords xmlns="http://www.vmware.com/vcloud/v1.5" total="1" pageSize="25" page="1" name="adminVM" type="application/vnd.vmware.vcloud.query.records+xml" href="https://<vcloudHost>/api/query?type=adminVM&amp;page=1&amp;pageSize=25&amp;format=records&amp;filter=(name==ubieEMAIL1)" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.vmware.com/vcloud/v1.5 http://<vcloudHost>/api/v1.5/schema/master.xsd">
    <Link rel="alternate" type="application/vnd.vmware.vcloud.query.references+xml" href="https://<vcloudHost>/api/query?type=adminVM&amp;page=1&amp;pageSize=25&amp;format=references&amp;filter=(name==ubieEMAIL1)"/>
    <Link rel="alternate" type="application/vnd.vmware.vcloud.query.idrecords+xml" href="https://<vcloudHost>/api/query?type=adminVM&amp;page=1&amp;pageSize=25&amp;format=idrecords&amp;filter=(name==ubieEMAIL1)"/>
    <AdminVMRecord vmToolsVersion="8389" vdc="https://<vcloudHost>/api/vdc/099a3580-ee14-4262-8eb5-eb0586786b58" vc="https://<vcloudHost>/api/admin/extension/vimServer/7c443115-8d45-42f0-b2f0-86d255d0e552" status="POWERED_ON" org="https://<vcloudHost>/api/org/dcd46410-3dee-47e8-a47f-e2bb99eb6cc7" numberOfCpus="2" networkName="ubie Org Ext" name="ubieEMAIL1" moref="vm-161" memoryMB="3072" isVdcEnabled="true" isVAppTemplate="false" isPublished="false" isDeployed="true" isDeleted="false" hostName="<clusterMember>" hardwareVersion="8" guestOs="Microsoft Windows Server 2008 R2 (64-bit)" datastoreName="HUS-3" containerName="ubieEMAIL1" container="https://<vcloudHost>/api/vApp/vapp-03219d73-4fe2-406b-8d32-85121f773a6a" href="https://<vcloudHost>/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4" pvdcHighestSupportedHardwareVersion="8" containerStatus="RESOLVED"/>
</QueryResultRecords>

Now that last section is what we need, specifically the "https://<vcloudHost>/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4". We're going to use that to pull data on the disk layout.

curl -i -k -H "Accept:application/*+xml;version=1.5" -H "x-vcloud-authorization: HeJTbAEggL97iqdVsqMiINFhu4ZIRsYlPRd96dipjvc=" -X GET 'https://<vcloudHost>/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4/virtualHardwareSection/disks'

HTTP/1.1 200 OK
Date: Thu, 06 Mar 2014 16:49:28 GMT
Content-Type: application/vnd.vmware.vcloud.rasditemslist+xml;version=1.5
Date: Thu, 06 Mar 2014 16:49:28 GMT
Content-Length: 2018

<?xml version="1.0" encoding="UTF-8"?>
<RasdItemsList xmlns="http://www.vmware.com/vcloud/v1.5" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" type="application/vnd.vmware.vcloud.rasdItemsList+xml" href="https://<vcloudHost>/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4/virtualHardwareSection/disks" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.vmware.com/vcloud/v1.5 http://<vcloudHost>/api/v1.5/schema/master.xsd http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2.22.0/CIM_ResourceAllocationSettingData.xsd">
    <Link rel="edit" type="application/vnd.vmware.vcloud.rasdItemsList+xml" href="https://<vcloudHost>/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4/virtualHardwareSection/disks"/>
    <Item>
        <rasd:Address>0</rasd:Address>
        <rasd:Description>SCSI Controller</rasd:Description>
        <rasd:ElementName>SCSI Controller 0</rasd:ElementName>
        <rasd:InstanceID>2</rasd:InstanceID>
        <rasd:ResourceSubType>lsilogicsas</rasd:ResourceSubType>
        <rasd:ResourceType>6</rasd:ResourceType>
    </Item>
    <Item>
        <rasd:AddressOnParent>0</rasd:AddressOnParent>
        <rasd:Description>Hard disk</rasd:Description>
        <rasd:ElementName>Hard disk 1</rasd:ElementName>
        <rasd:HostResource xmlns:vcloud="http://www.vmware.com/vcloud/v1.5" vcloud:capacity="40960" vcloud:busSubType="lsilogicsas" vcloud:busType="6"></rasd:HostResource>
        <rasd:InstanceID>2000</rasd:InstanceID>
        <rasd:Parent>2</rasd:Parent>
        <rasd:ResourceType>17</rasd:ResourceType>
    </Item>
    <Item>
        <rasd:Address>0</rasd:Address>
        <rasd:Description>IDE Controller</rasd:Description>
        <rasd:ElementName>IDE Controller 0</rasd:ElementName>
        <rasd:InstanceID>3</rasd:InstanceID>
        <rasd:ResourceType>5</rasd:ResourceType>
    </Item>
</RasdItemsList>

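Now we're going to add a couple of disks- this is where it gets weird. You're going to need to create an XML response file from the above configuration info, and you'll need to keep all the current info in it as well. As far as I can tell, the list you send back replaces the whole disk layout, so anything you leave out goes away. Failure to keep the existing entries can completely destroy your VM! You've been warned.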
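
Given that warning, a quick sanity check before sending anything is cheap insurance. This is only a sketch in Python: it compares the InstanceIDs in the list you pulled down against the ones in the file you're about to send, and reports any that went missing. `instance_ids`, `dropped_items` and `items_list` are made-up helper names, and the demo data is stripped down to InstanceIDs only.

```python
import xml.etree.ElementTree as ET

RASD = "{http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData}"


def instance_ids(rasd_items_xml: str) -> set:
    """Collect every rasd:InstanceID in a RasdItemsList document."""
    root = ET.fromstring(rasd_items_xml.encode("utf-8"))
    return {el.text.strip() for el in root.iter(RASD + "InstanceID")}


def dropped_items(current_xml: str, new_xml: str) -> list:
    """InstanceIDs present in the current list but absent from the new one."""
    return sorted(instance_ids(current_xml) - instance_ids(new_xml))


def items_list(*ids):
    """Build a stripped-down RasdItemsList holding only InstanceIDs (demo data)."""
    items = "".join(f"<Item><rasd:InstanceID>{i}</rasd:InstanceID></Item>" for i in ids)
    return ('<RasdItemsList xmlns="http://www.vmware.com/vcloud/v1.5" '
            f'xmlns:rasd="{RASD[1:-1]}">{items}</RasdItemsList>')


current = items_list(2, 2000, 3)               # controller, disk 1, IDE controller
good = items_list(2, 2000, 2001, 2002, 3)      # everything kept, two disks added
bad = items_list(2, 2001, 2002, 3)             # oops, disk 2000 fell out

print(dropped_items(current, good))  # → []
print(dropped_items(current, bad))   # → ['2000']
```

An empty list doesn't prove the file is right, only that you haven't dropped an existing item by accident.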

Create your response file:

<?xml version="1.0" encoding="UTF-8"?>
<RasdItemsList xmlns="http://www.vmware.com/vcloud/v1.5" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" type="application/vnd.vmware.vcloud.rasdItemsList+xml" href="https://<vcloudHost>/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4/virtualHardwareSection/disks" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.vmware.com/vcloud/v1.5 http://<vcloudHost>/api/v1.5/schema/master.xsd http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2.22.0/CIM_ResourceAllocationSettingData.xsd">
    <Link rel="edit" type="application/vnd.vmware.vcloud.rasdItemsList+xml" href="https://<vcloudHost>/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4/virtualHardwareSection/disks"/>
    <Item>
        <rasd:Address>0</rasd:Address>
        <rasd:Description>SCSI Controller</rasd:Description>
        <rasd:ElementName>SCSI Controller 0</rasd:ElementName>
        <rasd:InstanceID>2</rasd:InstanceID>
        <rasd:ResourceSubType>lsilogicsas</rasd:ResourceSubType>
        <rasd:ResourceType>6</rasd:ResourceType>
    </Item>
    <Item>
        <rasd:AddressOnParent>0</rasd:AddressOnParent>
        <rasd:Description>Hard disk</rasd:Description>
        <rasd:ElementName>Hard disk 1</rasd:ElementName>
        <rasd:HostResource xmlns:vcloud="http://www.vmware.com/vcloud/v1.5" vcloud:capacity="40960" vcloud:busSubType="lsilogicsas" vcloud:busType="6"></rasd:HostResource>
        <rasd:InstanceID>2000</rasd:InstanceID>
        <rasd:Parent>2</rasd:Parent>
        <rasd:ResourceType>17</rasd:ResourceType>
    </Item>
    <Item>
        <rasd:AddressOnParent>1</rasd:AddressOnParent>
        <rasd:Description>Hard disk</rasd:Description>
        <rasd:ElementName>Hard disk 2</rasd:ElementName>
        <rasd:HostResource xmlns:vcloud="http://www.vmware.com/vcloud/v1.5" vcloud:capacity="153600" vcloud:busSubType="lsilogicsas" vcloud:busType="6"></rasd:HostResource>
        <rasd:InstanceID>2001</rasd:InstanceID>
        <rasd:Parent>2</rasd:Parent>
        <rasd:ResourceType>17</rasd:ResourceType>
    </Item>
    <Item>
        <rasd:AddressOnParent>2</rasd:AddressOnParent>
        <rasd:Description>Hard disk</rasd:Description>
        <rasd:ElementName>Hard disk 3</rasd:ElementName>
        <rasd:HostResource xmlns:vcloud="http://www.vmware.com/vcloud/v1.5" vcloud:capacity="102400" vcloud:busSubType="lsilogicsas" vcloud:busType="6"></rasd:HostResource>
        <rasd:InstanceID>2002</rasd:InstanceID>
        <rasd:Parent>2</rasd:Parent>
        <rasd:ResourceType>17</rasd:ResourceType>
    </Item>
    <Item>
        <rasd:Address>0</rasd:Address>
        <rasd:Description>IDE Controller</rasd:Description>
        <rasd:ElementName>IDE Controller 0</rasd:ElementName>
        <rasd:InstanceID>3</rasd:InstanceID>
        <rasd:ResourceType>5</rasd:ResourceType>
    </Item>
</RasdItemsList>
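
The two new Item blocks in that file follow the pattern of the existing disk: AddressOnParent and InstanceID each go up by one, and the controller settings are copied. A sketch of generating such a block, assuming a single SCSI controller like the example and assuming that numbering pattern holds (I only know it from this one VM). `next_disk_item` is a made-up name, and the capacity is in whatever units the existing vcloud:capacity values use.

```python
import xml.etree.ElementTree as ET

RASD = "{http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData}"
VCLOUD = "{http://www.vmware.com/vcloud/v1.5}"

# Same layout as the hard disk Items in the response file above.
ITEM_TEMPLATE = """    <Item>
        <rasd:AddressOnParent>{unit}</rasd:AddressOnParent>
        <rasd:Description>Hard disk</rasd:Description>
        <rasd:ElementName>Hard disk {number}</rasd:ElementName>
        <rasd:HostResource xmlns:vcloud="http://www.vmware.com/vcloud/v1.5" vcloud:capacity="{capacity}" vcloud:busSubType="{bus_sub_type}" vcloud:busType="{bus_type}"></rasd:HostResource>
        <rasd:InstanceID>{instance_id}</rasd:InstanceID>
        <rasd:Parent>{parent}</rasd:Parent>
        <rasd:ResourceType>17</rasd:ResourceType>
    </Item>"""


def next_disk_item(disks_xml: str, capacity: int) -> str:
    """Return the <Item> text for one more disk, modelled on the last existing one."""
    root = ET.fromstring(disks_xml.encode("utf-8"))
    # ResourceType 17 marks the hard disks in the listing above.
    disks = [item for item in root.iter(VCLOUD + "Item")
             if item.findtext(RASD + "ResourceType") == "17"]
    if not disks:
        raise ValueError("no existing hard disk to copy settings from")
    last = max(disks, key=lambda d: int(d.findtext(RASD + "AddressOnParent")))
    host = last.find(RASD + "HostResource")
    return ITEM_TEMPLATE.format(
        unit=int(last.findtext(RASD + "AddressOnParent")) + 1,
        number=len(disks) + 1,
        capacity=capacity,
        bus_sub_type=host.get(VCLOUD + "busSubType"),
        bus_type=host.get(VCLOUD + "busType"),
        instance_id=max(int(d.findtext(RASD + "InstanceID")) for d in disks) + 1,
        parent=last.findtext(RASD + "Parent"),
    )


# Trimmed copy of the original disk list (controller plus the one disk).
current = """<RasdItemsList xmlns="http://www.vmware.com/vcloud/v1.5" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData">
    <Item>
        <rasd:Address>0</rasd:Address>
        <rasd:ElementName>SCSI Controller 0</rasd:ElementName>
        <rasd:InstanceID>2</rasd:InstanceID>
        <rasd:ResourceSubType>lsilogicsas</rasd:ResourceSubType>
        <rasd:ResourceType>6</rasd:ResourceType>
    </Item>
    <Item>
        <rasd:AddressOnParent>0</rasd:AddressOnParent>
        <rasd:ElementName>Hard disk 1</rasd:ElementName>
        <rasd:HostResource xmlns:vcloud="http://www.vmware.com/vcloud/v1.5" vcloud:capacity="40960" vcloud:busSubType="lsilogicsas" vcloud:busType="6"></rasd:HostResource>
        <rasd:InstanceID>2000</rasd:InstanceID>
        <rasd:Parent>2</rasd:Parent>
        <rasd:ResourceType>17</rasd:ResourceType>
    </Item>
</RasdItemsList>"""

print(next_disk_item(current, 153600))
```

Run against the original one-disk list, this prints a block matching the "Hard disk 2" entry in the file above. You still paste it into the full list yourself, alongside everything that was already there.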

Now, provided you have enough room, you'll run the following:


curl -i -k -H "Accept:application/*+xml;version=1.5" -H "x-vcloud-authorization: HeJTbAEggL97iqdVsqMiINFhu4ZIRsYlPRd96dipjvc=" -X PUT 'https://<vcloudHost>/api/vApp/vm-af966c82-e9e9-4f39-a7f3-21fbf9560ed4/virtualHardwareSection/disks' -H "Content-Type: application/vnd.vmware.vcloud.rasdItemsList+xml" -d @create-disk

New disks created. Why on earth this is so miserable I have no idea, but it really shouldn't be.


Tuesday, January 28, 2014

Holy crap true believers!

And yes, Stan the Man would probably frown on that- but still.

So it's been the usual 4-5 months (or more) since my last update... talk about the more things change...

I've taken a job in Houston, Texas to work for an MSP that I've known for a while. In the two months that I've been here, I've learned quite a bit about tech that I'd only wished I could work with. For instance- the HUS series of SANs from Hitachi. Beautiful boxes, easy to physically install... and incredibly easy to configure. I never thought I'd find something as simple as iscsi-target, but I was definitely wrong. Add on top of that a crash course in hacking vCloud. That's right, today I get to migrate VMs under vCloud using the REST APIs. Tons more details at virtuallyghetto, but the highlight is- make sure your datastores are visible to vCloud.

Inside vCloud, you have to do all your management using vCloud Director- a web based headache, but it has its purposes. To add your new datastores, first you have to add them the usual way using VMware's viclient or via the web console- either will work fine. Next, you have to log in via the web to your vCloud Director- not fun if you've not done this before.

You'll have to log in as a system administrator for your vCloud, and navigate to "manage & monitor", find "Provider vDCs" and then select the right provider. In here, you'll add your datastore.

Not quite brain surgery, but nerve wracking when you're convinced it's going to explode on you at a moment's notice.

Saturday, August 24, 2013

VMware Datacenters and My stupidity == learning a new trick

So I decided in a demo environment that I would try to move an ADDC from one datacenter to another- without realizing that you can't. Oh sure, there's a slightly convoluted way of doing so via off-lining the VM, and doing a cold migration between datacenters, but as I found out very quickly- that can be filled with some very nasty gotchas.

Some background on this VM- it's just the test auth for a proof of concept (POC) system, so it does double duty as the RAS host as well. VPN in, do my work, happy clam. Note- this box is the VPN server.

Did I mention it's the VPN server yet? Very important point, and one that makes me damn glad I had out-of-band management set up.

Anyway, I start the preparations to move this VM by... shutting down the VM. Makes sense, it's a cold migration, right? Yeah, guess who winds up disconnected? So without a hint of panic, I try to log in to the host itself, thinking that AD going offline just popped vCenter. Can't reach the host- very quickly realize I can't reach any of the hosts. Now the panic briefly pops in, however it doesn't last long as I realize I can just log in to the console.

And discover another mild gotcha.

The ESXi shell is disabled by default.

Mind you, I've not played with 5.1 under the hood that much, so a quick tour of Google and I find out how to access the shell once again- F2 at the console, log in and navigate to "Troubleshooting Mode Options", enable the shell, and exit out. Finally, I've got shell!

So I log in and execute a quick vim-cmd vmsvc/getallvms in order to locate the vmid of the VM in question, followed by vim-cmd vmsvc/power.on <vmid>. No real panic as yet, but that's because I figured I'm already fired anyway, how much worse can it get? Less than 5 minutes later, I'm able to log in to the VPN and restart my connection to vCenter.
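
On a host with more than a handful of VMs, picking the vmid out of the vim-cmd vmsvc/getallvms listing by eye is a pain. A rough sketch of doing it in Python, assuming each row starts with the numeric vmid, then the name, then the bracketed datastore. The sample listing is hand-written to look like that layout, not pasted from a host, and `find_vmid` is a made-up name.

```python
import re

# vmid, then the VM name (which may contain spaces), then "[datastore] path".
ROW = re.compile(r"^(\d+)\s+(.*?)\s+\[")


def find_vmid(getallvms_output: str, vm_name: str):
    """Return the vmid for vm_name, or None if it is not listed."""
    for line in getallvms_output.splitlines():
        match = ROW.match(line)
        if match and match.group(2) == vm_name:
            return int(match.group(1))
    return None


sample = (
    "Vmid    Name          File                                 Guest OS                Version\n"
    "1       poc-dc01      [datastore1] poc-dc01/poc-dc01.vmx   windows7Server64Guest   vmx-08\n"
    "7       test web 2    [datastore1] web2/web2.vmx           rhel6_64Guest           vmx-08\n"
)

vmid = find_vmid(sample, "poc-dc01")
print(f"vim-cmd vmsvc/power.on {vmid}")  # → vim-cmd vmsvc/power.on 1
```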

That's when I notice the error logs make no mention of vCenter ever having an issue. The only issue on the whole system was the fact that the ESXi shell had been enabled on one of the hosts. Then it dawns on me- my original thought, that a short downtime wouldn't have any effect on the network, was correct; however, in my raging stupidity I'd forgotten where the VPN lived. That issue gets fixed today.

However, in looking around, I had an "AH HA!" moment that I needed to test (in my homelab... even if it's a POC, it's not for my tests- just the customer's). What if I join the new host to the current datacenter, do a live migration to the new host, remove the host and rejoin it to the correct datacenter? One quick test later, with a host running one half of my local AD (yes, in the home test lab I run two DCs minimum... I hate rebuilding them. That'll teach me to cut corners), I had my answer.

Removing the host from the datacenter does not delete the VMs on that host. Joining a host with VMs doesn't delete the VMs (but I already knew that)- So what's to keep this from working? As near as I can tell nothing- just make sure the VMs you are trying to move aren't in a cluster, as the cluster will more than likely try to bring them back after you've removed the host.

In conclusion- if you can remove and join hosts from the VMware datacenter, you can do a live migration of VMs between datacenters. It just takes a bit of forethought and planning.

And being smart enough to remember where your VPN lives.

Wednesday, July 31, 2013

It's been awhile again, I know. Bad poster!

And man have I been busy! The list of items is monstrous indeed- everything from VMware 5.1 and Server 2012 to NFS. I've been learning user administration (again! seems it changes every few years, go figure), more PowerShell, more Exchange, and getting to play with great toys, like Synology's RS2212+ (it's a thing of beauty, and I will be writing about it, and possibly buying one for the house), as well as redeveloping mass VMware deployments of Server 2008 (I know, old OS, but I had to deploy over 160 VMs in a single weekend).

I'll be detailing everything over the coming weeks, provided I don't get swamped again.

Monday, October 8, 2012

ESEUTIL, Unitrends UEB and my own stupidity

So I just recently got a copy of the Unitrends Enterprise Backup appliance for VMware installed and licensed over the weekend. Now me being me, I just had to try and dive in head first. Couldn't get a lot of things working, but I did get Exchange backups working- which is a massive plus in my book.

So, verified the backup, and it cleared my Exchange logs- yay! Talk about making me happy. Until a nagging voice at the back of my mind reminded me of recent experiences with Backup Exec (and the fact that there's no corresponding restore exec). A restore was needed! So I went through the process- but I wanted to restore only a single mailbox. Talk about a nail biter- as I'm going through the options, I created the restore share by selecting my mail db backup, selecting the option of "Next (Select Files/Items)" and...
Waiting. 16GB db plus about 30GB in log files- takes a while. Cool thing is, this restores to a local samba share, which you then access from your Exchange box.

Which is where the stupid on my part comes in- you'll see it soon.
So, I check the db and start trying to bring it to a clean shutdown state- (for those of you who don't know how to do this, check out this blog: ExchangeServerPro, excellent write up!) when I started running into trouble- my db checked the way I expected it to, my log files were good, however during the recovery phase...

Operation terminated with error -1032 (JET_errFileAccessDenied, Cannot access file, the file is locked or in use)

And it doesn't tell me which file. Of course, since I'm such a genius, I decide that it must mean the db is locked... nope, db's not locked.

Oh! right, the samba share is read-only! Why didn't I realize this sooner?
So I move the db to a directory I've got read/write access to, and it's all good, right?

Oh hell no.

Operation terminated with error -1032 (JET_errFileAccessDenied, Cannot access file, the file is locked or in use) exchange 2010

Now, I get a bit irritated and throw Handle and Process Explorer (both from Sysinternals, good stuff!) at it, only to find out that the db is never locked. In fact, the restore process gets to ~90% before failing without ever opening a log file or the db!

WTF?

Well, turns out that the log directory needs to be read/write accessible too... something I would've figured out almost 10 minutes sooner had I just checked the event logs for the ESE errors...
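
Since the error never says which file it can't touch, it's quicker to check up front that both the db directory and the log directory are actually writable. A minimal sketch in Python that tries to create a scratch file in each directory you hand it. `unwritable_dirs` is a made-up name and the E:\restore paths are hypothetical stand-ins for wherever you copied the restore.

```python
import tempfile


def unwritable_dirs(*dirs):
    """Return the directories where a scratch file cannot be created."""
    blocked = []
    for d in dirs:
        try:
            # Actually creating a file is a more honest test than reading permission bits.
            with tempfile.TemporaryFile(dir=d):
                pass
        except OSError:
            blocked.append(d)
    return blocked


if __name__ == "__main__":
    # Hypothetical locations for the restored database and its logs.
    for d in unwritable_dirs(r"E:\restore\db", r"E:\restore\logs"):
        print("cannot write to", d)
```

A missing directory gets reported the same way as a read-only one, which is fine for this purpose: either way the recovery isn't going to work from there.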

So, while the samba mount is cool, it's not as useful as I had hoped. But it does make it easier to get at the files and use the standard Microsoft tools for manipulating them (PowerShell, robocopy, xcopy and all the rest).

Problem solved, files copying, face red with shame.
Hopefully this sheds a bit of light on the problem in case anyone else is as dense as I can be.