actually my main problem is until now i dont quite get the appeal of this cloud thing. especially for hpc.
other than startups, i imagine everybody wants a fixed budget. i do not get pay as u go in the commercial context. i would totally rather take a monthly vps than a thing that bills me by cpu cycles and bytes of traffic and disk used, because then i can write down in an xlsx how much it will cost this year and next year.
for hpc its even weirder. the whole point of using pbs or slurm is that i have limited resources and infinite wants. thats why people stand in queue, right? if u can launch shit on demand then there is no need for a scheduler or a queue, just go ahead and start as many vm on aws, gcp, azure as u like? to be more specific:
u spin up 20 vm and directly mpirun on all ur 20 nodes! u dont qsub or sbatch and then wait for ur turn..
i mean. for startups and poc i get it. spin up something quick. especially for startups. if load comes in, u r rich. if it doesnt work u can tear it down.
but for everyone else, i imagine i need exactly this: capacity planning. i need to know how much money to ask for next year. i need to minimise the entire lifetime cost of the thing, like over 5-10 years.
at least thats what i thought.
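a rough sketch of that spreadsheet math, in python instead of xlsx. every number here is a made-up placeholder (not a real quote from any vendor), and the function names are just mine. it only shows the shape of the comparison: buy once and pay running costs, versus pay per hour used.

```python
# NOTE: all figures are hypothetical placeholders, not real prices.

def on_prem_total(capex, yearly_running_cost, years):
    """Buy the hardware once, then pay power/hosting/etc. every year."""
    return capex + yearly_running_cost * years


def cloud_total(hourly_rate, hours_used_per_year, years):
    """Pay only for the hours actually used."""
    return hourly_rate * hours_used_per_year * years


def break_even_hours_per_year(capex, yearly_running_cost, hourly_rate, years):
    """Hours of use per year above which owning is cheaper over the lifetime."""
    return on_prem_total(capex, yearly_running_cost, years) / (hourly_rate * years)


if __name__ == "__main__":
    capex = 100_000    # hypothetical purchase price of the box
    running = 10_000   # hypothetical yearly power + hosting
    rate = 5.0         # hypothetical hourly price of a comparable cloud instance
    years = 5

    print("own it for 5 years:      ", on_prem_total(capex, running, years))
    print("rent it 24x7 for 5 years:", cloud_total(rate, 8760, years))
    print("break-even hours/year:   ",
          break_even_hours_per_year(capex, running, rate, years))
```

with these placeholder numbers, owning wins once the machine is busy more than about 6000 hours a year. if it mostly sits idle, renting wins. plug in real quotes before believing either side.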
and i dont get using commercial software or proprietary solutions. like if u did ur infra in google appengine. how on earth are u going to migrate to another vendor?
if u use slurm, wherever u want to install it, on however many nodes, u just do it.
if u use pbspro or..yeah..parallelworks... now u have to count licenses? and every N years discuss renewal?
or. if your firewall is iptables. or firewalld. u can copy that stuff anywhere.
if ur firewall is fortigate. how to migrate to palo alto?
the only angle i can see. is setting up a datacenter is capex. and cloud spend is opex. and finance might prefer opex...
yes there are all sorts of in-between. from ur own dc to colo to dedicated servers to vps to..stuff that bills u by cpu cycles and bytes transferred or something...
1 thing i heard so far that makes some sense is "cloud bursting"
i can see it. for 350 days of the year ur company just needs the bare minimum of resources. but on, dunno, black friday or cyber monday or 11.11 day, suddenly u need 1000x the capacity. basically u do all ur business for the year in just 2 weeks. i guess that works.
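to put rough numbers on the bursting idea, here is a tiny sketch. the 1000x peak and the 2 weeks come from above. the prices are invented, and i deliberately made the rented capacity several times pricier per day than owned capacity, which is an assumption, not a quote.

```python
# NOTE: unit prices are hypothetical; only the 1000x peak and ~2 weeks
# of burst come from the text above.

def own_peak_all_year(peak_units, cost_per_unit_year):
    """Size your own hardware for the peak and keep it all year."""
    return peak_units * cost_per_unit_year


def own_base_and_burst(base_units, peak_units, cost_per_unit_year,
                       cloud_cost_per_unit_day, burst_days):
    """Own only the baseline; rent the difference during the burst."""
    extra_units = max(peak_units - base_units, 0)
    owned = base_units * cost_per_unit_year
    rented = extra_units * cloud_cost_per_unit_day * burst_days
    return owned + rented


if __name__ == "__main__":
    base, peak = 1, 1000   # 1000x capacity at peak
    per_unit_year = 1000   # hypothetical cost of owning one unit for a year
    per_unit_day = 10      # hypothetical rental price, ~3.65x the owned daily cost
    days = 14              # "2 weeks"

    print("own the peak all year:", own_peak_all_year(peak, per_unit_year))
    print("own base + burst:     ",
          own_base_and_burst(base, peak, per_unit_year, per_unit_day, days))
```

even with rental priced well above owned capacity, the bursty shop pays a fraction of what sizing for the peak would cost. the math flips the moment the "burst" lasts most of the year.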
but why hpc also cloud bursting?
again. if u have the money. go ahead and run it concurrently on 1000 ec2 instances already.
we wouldnt have needed to set up the cluster in the first place, and make everyone queue their jobs.
and its even weirder to see governments talk about cloud. these ppl run on fixed budgets right? how can they write down that they dont know how much next year's IT spend will be?
and the thing is. if u have a budget. and if u can accumulate it for 5yrs. u can buy hardware. and u can run simple applications on 100 cores and 2tb ram. instead of figuring out scaling out/clustering/raft consensus/metadata servers/blah...
yes availability is probably .95 not .995...but u get a stack that is much easier to maintain and troubleshoot.
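for scale, assuming .95 and .995 are meant as availability fractions (my reading, the numbers themselves are just the ones above), this is what they translate to in downtime per year:

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability):
    """Expected hours of downtime per year for a given availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    for a in (0.95, 0.995):
        hours = downtime_hours_per_year(a)
        print(f"{a}: about {hours:.1f} hours/year ({hours / 24:.1f} days)")
```

so roughly 438 hours (about 18 days) versus roughly 44 hours (under 2 days) a year. whether that gap is worth a distributed stack depends on who is waiting on the machine.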
why then is this hosted on cf workers? because its webhosting, which predates this cloud thing. because its free. and because i dont care about migrating..if its gone so be it.
i mean. i cannot imagine it. why would anyone need to scale beyond a single 4u with 60 hdds 8 nvme ssds 100 cores and 2tb ram, if they are not fortune 500 or faang. or at least hosting some kind of popular service online. or providing dbs online banking or running the backend of paynow or something. we should be talking about 1000s, if not more, of concurrent users on such a machine.
scaling up, with backups, instead of scaling out, means no distributed filesystems. no clustered databases n app servers. nothing. no complexity! and a whole world fewer issues to troubleshoot! and running on that machine, even if they run in docker, ur apps will be fast! accessing ur apps within ur 10gbe lan will be fast.
ok im probably overdoing it
but yes instead of running 2 x named yourself u shld use cloudflare free tier la..
ah which raises the question: why would anyone use anything other than bind? yes windows AD users have no choice. but for everyone else, bind zone files are universal right? unless i have really good reasons i dont want to know the format that unbound or dnsmasq use right?