Questions tagged [cluster-computing]

A computer cluster consists of a set of loosely connected computers that work together so that in many respects they can be viewed as a single system. Cluster management is centralized, as opposed to a grid's decentralized approach (Wikipedia).

5527 questions
165
votes
4 answers

Database cluster and load balancing

What is database clustering? If you allow the same database to be on 2 different servers, how do they keep the data synchronized between them? And how does this differ from load balancing from a database server perspective?
159
votes
4 answers

Docker-Swarm, Kubernetes, Mesos & Core-OS Fleet

I am relatively new to all of these, and I'm having trouble getting a clear picture of how the listed technologies differ. They all try to solve different problems, but they have things in common too. I would like to understand what are the things…
B_B
  • 2,013
  • 3
  • 14
  • 13
121
votes
6 answers

What is the difference between Cloud, Grid and Cluster?

What is the difference between Cloud, Cluster and Grid? Please give some examples of each as the definition of cloud is very broad. As answered in another question, can I call Dropbox, Gmail, Facebook, Youtube, Rapidshare etc. a Cloud? What are the…
SMUsamaShah
  • 7,677
  • 22
  • 88
  • 131
105
votes
1 answer

How to set up workers for parallel processing in R using snowfall and multiple Windows nodes?

I’ve successfully used snowfall to set up a cluster on a single server with 16 processors. require(snowfall) if (sfIsRunning() == TRUE) sfStop() number.of.cpus <- 15 sfInit(parallel = TRUE, cpus = number.of.cpus) stopifnot( sfCpus() ==…
jclouse
  • 2,289
  • 1
  • 20
  • 25
87
votes
5 answers

"OSError: [Errno 17] File exists" when trying to use os.makedirs

I have several threads running in parallel from Python on a cluster system. Each Python thread outputs to a directory mydir. Each script, before outputting, checks whether mydir exists and, if not, creates it: if not os.path.isdir(mydir): …
user248237
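
The check-then-create pattern described in this question can race when several workers start at once: two of them may both see that the directory is missing, and the slower one then fails with "File exists". A minimal sketch of one common way around this, assuming Python 3.2 or later, is to skip the separate check and let `os.makedirs` tolerate an existing directory. The helper name `ensure_dir` is made up for illustration and is not from the question.

```python
import os


def ensure_dir(path):
    """Create path (and any missing parents) without a separate existence check.

    exist_ok=True makes os.makedirs succeed when the directory is already
    there, so it does not matter which worker creates it first. It still
    raises FileExistsError if path exists but is not a directory.
    """
    os.makedirs(path, exist_ok=True)
    return path
```

Each worker would call `ensure_dir(mydir)` before writing its output. On older Python versions without `exist_ok`, the usual equivalent is to call `os.makedirs` inside a `try` block and ignore the error only when its `errno` is `errno.EEXIST`.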
86
votes
9 answers

Scaling solutions for MySQL (Replication, Clustering)

At the startup where I work, we are now considering scaling solutions for our database. Things get somewhat confusing (for me at least) with MySQL, which has MySQL Cluster, replication, and MySQL Cluster replication (from ver. 5.1.6), which is…
Eran Galperin
  • 86,251
  • 24
  • 115
  • 132
85
votes
5 answers

MPI: blocking vs non-blocking

I am having trouble understanding the concept of blocking communication and non-blocking communication in MPI. What are the differences between the two? What are the advantages and disadvantages?
lamba
  • 1,581
  • 5
  • 18
  • 29
77
votes
11 answers

ZooKeeper alternatives? (cluster coordination service)

ZooKeeper is a highly available coordination service for data centers. It originated in the Hadoop project. One can implement locking, failover, leader election, group membership and other coordination tasks on top of it. Are there any…
71
votes
9 answers

Use qdel to delete all my jobs at once, not one at a time

This is a rather simple question but I haven't been able to find an answer. I have a large number of jobs running in a cluster (>20) and I'd like to delete them all and start over. According to this site I should be able to just do: qdel -u…
Gabriel
  • 40,504
  • 73
  • 230
  • 404
60
votes
5 answers

Web App: High Availability / How to prevent a single point of failure?

Can someone explain to me how high availability ("HA") works for a web application? I assume HA means that there is no single point of failure. However, even if a load balancer is used, isn't that itself a single point of failure?
nickb
  • 9,140
  • 11
  • 39
  • 48
59
votes
5 answers

64-bit JVM limited to 300GB of memory?

I am attempting to run a Java application on a cluster computing environment (IBM LSF running CentOS release 6.2 Final) that can provide me with up to 1TB of RAM space. I could create a JVM with up to 300GB of maximum memory (Xmx), although I need…
critichu
  • 716
  • 6
  • 9
53
votes
10 answers

PHP sessions in a load balancing cluster - how?

OK, so I've got this totally rare and unique scenario of a load-balanced PHP website. The bummer is that it didn't use to be load balanced. Now we're starting to get issues... Currently the only issue is with PHP sessions. Naturally nobody thought of…
Vilx-
  • 104,512
  • 87
  • 279
  • 422
50
votes
2 answers

What's the meaning of "Locality Level" on a Spark cluster?

What's the meaning of the title "Locality Level" and its 5 statuses: data local --> process local --> node local --> rack local --> any?
fanhk
  • 745
  • 1
  • 10
  • 15
47
votes
4 answers

AWS ECS Task Memory Hard and Soft Limits

I'm confused about the purpose of having both hard and soft memory limits for ECS task definitions. IIRC the soft limit is how much memory the scheduler reserves on an instance for the task to run, and the hard limit is how much memory a container…
maambmb
  • 881
  • 1
  • 8
  • 18
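
For readers unfamiliar with the two settings this question asks about, here is a minimal sketch of where they sit in an ECS container definition, written as a Python dict. The field names `memoryReservation` (soft limit) and `memory` (hard limit) are the ECS container-definition parameters, both in MiB. The container name, image and numbers are made up for illustration, and the helper `check_memory_limits` is hypothetical. It encodes the constraint, as I understand the ECS documentation, that the hard limit must be greater than the soft limit when both are set.

```python
# Hypothetical container definition fragment; name, image and sizes are
# illustrative assumptions, not values from the question.
container_definition = {
    "name": "example-worker",
    "image": "example/worker:latest",
    "memoryReservation": 256,  # soft limit in MiB
    "memory": 512,             # hard limit in MiB
}


def check_memory_limits(definition):
    """Return (soft, hard) and reject a soft limit that is not below the hard limit."""
    soft = definition.get("memoryReservation")
    hard = definition.get("memory")
    if soft is not None and hard is not None and soft >= hard:
        raise ValueError("hard limit must be greater than soft limit")
    return soft, hard
```

Calling `check_memory_limits(container_definition)` returns the pair of limits, which makes it easy to see that the two values are set independently for the same container.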
47
votes
3 answers

Spread vs MPI vs zeromq?

In one of the answers to Broadcast like UDP with the Reliability of TCP, a user mentions the Spread messaging API. I've also run across one called ØMQ. I also have some familiarity with MPI. So, my main question is: why would I choose one over…
Ben Collins
  • 20,538
  • 18
  • 127
  • 187